TITLE OF THE INVENTION

ENZYME-BASED G-PROTEIN-COUPLED RECEPTOR ASSAY



BACKGROUND OF THE INVENTION 
This application claims the benefit of Provisional Application Serial No.
60/180,669, filed February 7, 2000. The entirety of that provisional application is
incorporated herein by reference. 

Field of the Invention 

This invention relates to methods of detecting G-protein-coupled receptor (GPCR) 
activity, and provides methods of assaying GPCR activity and methods for screening for 
GPCR ligands, G-protein-coupled receptor kinase (GRK) activity, and compounds that
interact with components of the GPCR regulatory process. 

The actions of many extracellular signals are mediated by the interaction of G-protein- 
coupled receptors (GPCRs) and guanine nucleotide-binding regulatory proteins (G-proteins). 
G-protein-mediated signaling systems have been identified in many divergent organisms,
such as mammals and yeast. The GPCRs represent a large super family of proteins which 
have divergent amino acid sequences, but share common structural features, in particular, the 
presence of seven transmembrane helical domains. GPCRs respond to, among other 
extracellular signals, neurotransmitters, hormones, odorants and light. Individual GPCR 
types activate a particular signal transduction pathway; at least ten different signal 
transduction pathways are known to be activated via GPCRs. For example, the beta 2-
adrenergic receptor (β2AR) is a prototype mammalian GPCR. In response to agonist binding,
β2AR receptors activate a G-protein (Gs) which in turn stimulates adenylate cyclase activity
and results in increased cyclic adenosine monophosphate (cAMP) production in the cell.

The signaling pathway and final cellular response that result from GPCR stimulation
depend on the specific class of G-protein with which the particular receptor is coupled
(Hamm, "The Many Faces of G Protein Signaling," J. Biol. Chem., 273:669-672 (1998)). For
instance, coupling to the Gs class of G-proteins stimulates cAMP production and activation
of Protein Kinase A and C pathways, whereas coupling to the Gi class of G-proteins down-
regulates cAMP. Other second messenger systems such as calcium, phospholipase C, and
phosphatidylinositol 3 may also be utilized. As a consequence, GPCR signaling events have
predominantly been measured via quantification of these second messenger products.

A common feature of GPCR physiology is desensitization and recycling of the 
receptor through the processes of receptor phosphorylation, endocytosis and 
dephosphorylation (Ferguson, et al., "G-protein-coupled receptor regulation: role of G-
protein-coupled receptor kinases and arrestins," Can. J. Physiol. Pharmacol., 74:1095-1110
(1996)). Ligand-occupied GPCRs can be phosphorylated by two families of serine/threonine 
kinases, the G-protein-coupled receptor kinases (GRKs) and the second messenger-dependent 
protein kinases such as protein kinase A and protein kinase C. Phosphorylation by either 
class of kinases serves to down-regulate the receptor by uncoupling it from its corresponding 
G-protein. GRK-phosphorylation also serves to down-regulate the receptor by recruitment of 
a class of proteins known as the arrestins that bind the cytoplasmic domain of the receptor 
and promote clustering of the receptor into endocytic vesicles. Once the receptor is
endocytosed, it will either be degraded in lysosomes or dephosphorylated and recycled back
to the plasma membrane as a fully functional receptor.

Binding of an arrestin protein to an activated receptor has been documented as a 
common phenomenon for a variety of GPCRs ranging from rhodopsin to β2AR to the
neurotensin receptor (Barak, et al., "A β-arrestin/Green Fluorescent Protein Biosensor
for Detecting G-Protein-Coupled Receptor Activation," J. Biol. Chem., 272:27497-500
(1997)). Consequently, monitoring arrestin interaction with a specific GPCR can be utilized
as a generic tool for measuring GPCR activation. Similarly, a single G-protein and GRK also
partner with a variety of receptors (Hamm, et al. (1998) and Pitcher et al., "G-Protein-
Coupled Receptor Kinases," Annu. Rev. Biochem., 67:653-92 (1998)), such that these
protein/protein interactions may also be monitored to determine receptor activity. 

The present invention involves the use of a proprietary technology (ICAST™, 
Intercistronic Complementation Analysis Screening Technology) for monitoring 
protein/protein interactions in GPCR signaling. The method involves using two inactive β-
galactosidase mutants, each of which is fused to one member of an interacting protein pair, such
as a GPCR and an arrestin. The formation of an active β-galactosidase complex is driven by
interaction of the target proteins. In this system, β-galactosidase activity acts as a read out of
GPCR activity. FIGURE 23 is a schematic depicting the method of the present invention. 
FIGURE 23 shows two inactive mutants that become active when they interact. In addition, 
this technology could be used to monitor GPCR-mediated signaling pathways via other 
downstream signaling components such as G-proteins, GRKs or c-Src. 

Many therapeutic drugs in use today target GPCRs, as they regulate vital 
physiological responses, including vasodilation, heart rate, bronchodilation, endocrine 
secretion and gut peristalsis. See, Lefkowitz et al., Annu. Rev. Biochem., 52:159 (1983).
For instance, drugs targeting the highly studied GPCR, β2AR, are used in the treatment of
anaphylaxis, shock, hypertension, asthma and other conditions. Some of these drugs mimic
the ligand for this receptor. Other drugs act to antagonize the receptor in cases when disease
arises from spontaneous activity of the receptor.

Efforts such as the Human Genome Project are identifying new GPCRs ("orphan" 
receptors) whose physiological roles and ligands are unknown. It is estimated that several 
thousand GPCRs exist in the human genome. Of the 250 GPCRs identified to date, only 150
have been associated with ligands. 



SUMMARY OF THE INVENTION 
A first aspect of the present invention is a method that monitors GPCR function 
proximally at the site of receptor activation, thus providing more information for drug 
discovery purposes due to fewer competing mechanisms. Activation of the GPCR is 
measured by a read-out for interaction of the receptor with a regulatory component such as 
arrestin, G-protein, GRK or other kinases, the binding of which to the receptor is dependent 
upon agonist occupation of the receptor. Protein/protein interaction is detected by 
complementation of reporter proteins such as utilized by the ICAST™ technology. 

A further aspect of the present invention is a method of assessing G-protein-coupled 
receptor (GPCR) pathway activity under test conditions by providing a test cell that expresses 
a GPCR, e.g., muscarinic, adrenergic, dopamine, angiotensin or endothelin, as a fusion
protein to a mutant reporter protein and an interacting protein, i.e., G-protein, arrestin or GRK, as a
fusion protein with a complementing reporter protein. When test cells are exposed to a
known agonist to the target GPCR under test conditions, activation of the GPCR will be
monitored by complementation of the reporter enzyme. Increased reporter enzyme activity
reflects interaction of the GPCR with its interacting protein partner.

A further aspect of the present invention is a method of assessing GPCR pathway
activity in the presence of a test kinase.

A further aspect of the present invention is a method of assessing GPCR pathway 
activity in the presence of a test G-protein. 

A further aspect of the present invention is a method of assessing GPCR pathway 
activity upon exposure of the test cell to a test ligand. 

A further aspect of the present invention is a method of assessing GPCR pathway 
activity upon co-expression in the test cell of a second receptor. 

A further aspect of the present invention is a method for screening for a ligand or
agonist to an orphan GPCR. The ligand or agonist could be contained in natural or synthetic
libraries or mixtures or could be a physical stimulus. A test cell is provided that expresses the
orphan GPCR as a fusion protein with one β-galactosidase mutant and, for example, an
arrestin or mutant form of arrestin as a fusion protein with another β-galactosidase mutant.
The interaction of the arrestin with the orphan GPCR upon receptor activation is measured by
enzymatic activity of the complemented β-galactosidase. The test cell is exposed to a test
compound, and an increase in β-galactosidase activity indicates the presence of a ligand or
agonist. 

A further aspect of the present invention is a method for screening a protein of 
interest, for example, an arrestin protein (or mutant form of the arrestin protein) for the ability 
to bind to a phosphorylated, or activated, GPCR. A cell is provided that expresses a GPCR 
and contains β-arrestin. The cell is exposed to a known GPCR agonist and then reporter
enzyme activity is detected. Increased reporter enzyme activity indicates that the β-arrestin
molecule can bind to phosphorylated, or activated, GPCR in the test cell.

A further aspect of the present invention is a method to screen for an agonist to a
specific GPCR. The agonist could be contained in natural or synthetic libraries or could be a
physical stimulus. A test cell is provided that expresses a GPCR as a fusion protein with one
β-galactosidase mutant and, for example, an arrestin as a fusion protein with another β-
galactosidase mutant. The interaction of arrestin with the GPCR upon receptor activation is
measured by enzymatic activity of the complemented β-galactosidase. The test cell is
exposed to a test compound, and an increase in β-galactosidase activity indicates the presence
of an agonist. The test cell may express a known GPCR or a variety of known GPCRs, or
may express an unknown GPCR or a variety of unknown GPCRs. The GPCR may be, for
example, an odorant GPCR or a βAR GPCR.
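
The agonist read-out described above (an increase in reporter activity on exposure to a test compound) can be expressed numerically. The following is a minimal sketch, assuming luminescence readings in relative light units from replicate wells; the function names and the 2.0-fold cutoff are hypothetical illustrations and are not values taken from this disclosure.

```python
from statistics import mean


def fold_induction(treated_rlu, basal_rlu):
    """Ratio of mean treated signal to mean basal (untreated) signal."""
    basal = mean(basal_rlu)
    if basal <= 0:
        raise ValueError("basal signal must be positive")
    return mean(treated_rlu) / basal


def is_agonist_hit(treated_rlu, basal_rlu, threshold=2.0):
    """Flag a compound whose fold induction meets an illustrative cutoff.

    The default threshold is an assumption chosen for the example only.
    """
    return fold_induction(treated_rlu, basal_rlu) >= threshold
```

In practice the cutoff would be set from the variability of control wells for the particular cell line and reporter pair.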

A further aspect of the present invention is a method of screening a test compound for 
G-protein-coupled receptor (GPCR) antagonist activity. A test cell is provided that expresses 
a GPCR as a fusion protein with one β-galactosidase mutant and, for example, an arrestin as a
fusion protein with another β-galactosidase mutant. The interaction of arrestin with the
GPCR upon receptor activation is measured by enzymatic activity of the complemented β-
galactosidase. The test cell is exposed to a test compound, and an increase in β-galactosidase
activity indicates the presence of an agonist. The cell is exposed to a test compound and to a
GPCR agonist, and reporter enzyme activity is detected. When exposure to the agonist occurs
at the same time as or subsequent to exposure to the test compound, a decrease in β-
galactosidase activity after exposure to the test compound indicates that the test compound
has antagonist activity to the GPCR.
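
The antagonist read-out (a decrease in agonist-induced reporter activity) is commonly summarized as percent inhibition. The following is a minimal sketch under that assumption; the function name and the normalization against agonist-only and basal controls are illustrative choices and are not prescribed by this disclosure.

```python
def percent_inhibition(signal, agonist_only, basal):
    """Percent loss of the agonist-induced signal window.

    signal:       reporter activity with test compound plus agonist
    agonist_only: reporter activity with agonist alone (0% inhibition)
    basal:        reporter activity of untreated cells (100% inhibition)
    """
    window = agonist_only - basal
    if window <= 0:
        raise ValueError("agonist-only signal must exceed basal signal")
    return 100.0 * (agonist_only - signal) / window
```

A value near 0 indicates no antagonist activity, and a value near 100 indicates that the test compound returned the signal to the basal level.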

A further aspect of the present invention is a method of screening a sample solution 
for the presence of an agonist, antagonist or ligand to a G-protein-coupled receptor (GPCR).
A test cell is provided that expresses a GPCR fusion and contains, for example, a β-arrestin
protein fusion. The test cell is exposed to a sample solution, and reporter enzyme activity is
assessed. Changed reporter enzyme activity after exposure to the sample solution indicates
the sample solution contains an agonist, antagonist or ligand for a GPCR expressed in the cell.

A further aspect of the present invention is a method of screening a cell for the 
presence of a G-protein-coupled receptor (GPCR). 

A further aspect of the present invention is a method of screening a plurality of cells 
for those cells which contain a G-protein coupled receptor (GPCR). 

A further aspect of the invention is a method for mapping GPCR-mediated signaling 
pathways. For instance, the system could be utilized to monitor interaction of c-Src with β-
arrestin-1 upon GPCR activation. Additionally, the system could be used to monitor
protein/protein interactions involved in cross-talk between GPCR signaling pathways and
other pathways such as that of the receptor tyrosine kinases or Ras/Raf.

A further aspect of the invention is a method for monitoring homo- or hetero-
dimerization of GPCRs upon agonist or antagonist stimulation.

A further aspect of the invention is a method of screening a cell for the presence of a 
G-protein-coupled receptor (GPCR) responsive to a GPCR agonist. A cell is provided that 
contains protein partners that interact downstream in the GPCR's pathway. The protein 
partners are expressed as fusion proteins to the mutant, complementing enzyme and are used 
to monitor activation of the GPCR. The cell is exposed to a GPCR agonist and then 
enzymatic activity of the reporter enzyme is detected. Increased reporter enzyme activity 
indicates that the cell contains a GPCR responsive to the agonist.

The invention is achieved by using ICAST™ protein/protein interaction screening to
map signaling pathways. This technology is applicable to a variety of known and unknown
GPCRs with diverse functions. They include, but are not limited to, the following sub-
families of GPCRs:

(a) receptors that bind to amine-like ligands: Acetylcholine muscarinic receptor (M1
to M5), alpha and beta Adrenoceptors, Dopamine receptors (D1, D2, D3 and D4), Histamine
receptors (H1 and H2), Octopamine receptor and Serotonin receptors (5HT1, 5HT2, 5HT4,
5HT5, 5HT6, 5HT7);

(b) receptors that bind to a peptide ligand: Angiotensin receptor, Bombesin receptor,
Bradykinin receptor, C-C chemokine receptors (CCR1 to CCR8, and CCR10), C-X-C type
Chemokine receptors (CXC-R5), Cholecystokinin type A receptor, CCK type receptors,
Endothelin receptor, Neurotensin receptor, FMLP-related receptors, Somatostatin receptors
(type 1 to type 5) and Opioid receptors (type D, K, M, X);

(c) receptors that bind to hormone proteins: Follicle stimulating hormone receptor,
Thyrotrophin receptor and Lutropin-choriogonadotropic hormone receptor;

(d) receptors that bind to neurotransmitters: Substance P receptor, Substance K
receptor and neuropeptide Y receptor;

(e) Olfactory receptors: Olfactory type 1 to type 11, Gustatory and odorant receptors;

(f) Prostanoid receptors: Prostaglandin E2 (EP1 to EP4 subtypes), Prostacyclin and
Thromboxane;

(g) receptors that bind to metabotropic substances: Metabotropic glutamate group I to
group III receptors;

(h) receptors that respond to physical stimuli, such as light, or to chemical stimuli,
such as taste and smell; and

(i) orphan GPCRs: the natural ligand to the receptor is undefined.

ICAST™ provides many benefits to the screening process, including the ability to 
monitor protein interactions in any sub-cellular compartment (membrane, cytosol and nucleus);
the ability to achieve a more physiologically relevant model without requiring protein 
overexpression; and the ability to achieve a functional assay for receptor binding allowing 
high information content. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1. Cellular expression levels of β2 adrenergic receptor (β2AR) and β-
arrestin-2 (βArr2) in C2 clones. Quantification of β-gal fusion protein was performed using
antibodies against β-gal and purified β-gal protein in a titration curve by a standardized
ELISA assay. Figure 1A shows expression levels of β2AR-βgalΔα clones (in expression
vector pICAST ALC). Figure 1B shows expression levels of βArr2-βgalΔω in expression
vector pICAST OMC4 for clones 9-3, -7, -9, -10, -19 and -24, or in expression vector
pICAST OMN4 for clones 12-4, -9, -16, -18, -22 and -24.

FIGURE 2. Receptor β2AR activation was measured by agonist-stimulated cAMP
production. C2 cells expressing pICAST ALC β2AR (clone 5) or parental cells were treated
with increasing concentrations of (-)isoproterenol and 0.1 mM IBMX. The quantification of
cAMP level was expressed as pmol/well.

FIGURE 3. Interaction of activated receptor β2AR and arrestin can be measured by
β-galactosidase complementation. Figure 3A shows a time course of β-galactosidase activity
in response to agonist (-)isoproterenol stimulation in C2 cells expressing β2AR-βgalΔα (β2AR
alone, in expression vector pICAST ALC), or C2 clones, and a pool of C2 co-expressing
β2AR-βgalΔα and βArr2-βgalΔω (in expression vectors pICAST ALC and pICAST OMC).
Figure 3B shows a time course of β-galactosidase activity in response to agonist
(-)isoproterenol stimulation in C2 cells expressing β2AR alone (in expression vector pICAST
ALC) and C2 clones co-expressing β2AR and βArr1 (in expression vectors pICAST ALC and
pICAST OMC).

FIGURE 4. Agonist dose response for interaction of β2AR and arrestin can be
measured by β-galactosidase complementation. Figure 4A shows a dose response to agonists
(-)isoproterenol and procaterol in C2 cells co-expressing pICAST ALC β2AR and pICAST
OMC βArr2 fusion constructs. Figure 4B shows a dose response to agonists (-)isoproterenol
and procaterol in C2 cells co-expressing pICAST ALC β2AR and pICAST OMC βArr1
fusion constructs.

FIGURE 5. Antagonist-mediated inhibition of receptor activity can be measured by
β-galactosidase complementation in cells co-expressing β2AR-βgalΔα and βArr-βgalΔω.
Figure 5A shows specific inhibition with adrenergic antagonists ICI-118,551 and propranolol
of β-galactosidase activity in C2 clones co-expressing pICAST ALC β2AR and pICAST
OMC βArr2 fusion constructs after incubation with agonist (-)isoproterenol. Figure 5B
shows specific inhibition of β-galactosidase activity with adrenergic antagonists ICI-118,551
and propranolol in C2 clones co-expressing pICAST ALC β2AR and pICAST OMC βArr1
fusion constructs in the presence of agonist (-)isoproterenol.

FIGURE 6. C2 cells expressing adenosine receptor A2a show cAMP induction in
response to agonist (CGS-21680) treatment. C2 parental cells and C2 cells co-expressing
pICAST ALC A2aR and pICAST OMC βArr1 as a pool or as selected clones were measured
for agonist-induced cAMP response (pmol/well).

FIGURE 7. Agonist-stimulated cAMP response in C2 cells co-expressing Dopamine
receptor D1 (D1-βgalΔα) and β-arrestin-2 (βArr2-βgalΔω). The clone expressing βArr2-
βgalΔω (Arr2 alone) was used as a negative control in the assay. Cells expressing D1-
βgalΔα in addition to βArr2-βgalΔω responded to agonist treatment (3-hydroxytyramine
hydrochloride at 3 µM). D1 (PIC2) or D1 (PIC3) designate D1 in expression vector pICAST
ALC2 or pICAST ALC4, respectively.

FIGURE 8. A variety of mammalian cell lines can be used to generate stable cells for
monitoring GPCR and arrestin interactions. FIGURE 8A, FIGURE 8B and FIGURE 8C show
examples of HEK293, CHO and CHW cell lines co-expressing adrenergic receptor β2AR
and arrestin fusion proteins of β-galactosidase mutants. The β-galactosidase activity was used
to monitor agonist-induced interaction of β2AR and arrestin proteins.

FIGURE 9. β-gal complementation can be used to monitor β2 adrenergic receptor
homo-dimerization. FIGURE 9A shows β-galactosidase activity in HEK293 clones co-
expressing pICAST ALC β2AR and pICAST OMC β2AR. FIGURE 9B shows a cAMP
response to agonist (-)isoproterenol in HEK293 clones co-expressing pICAST ALC β2AR
and pICAST OMC β2AR. HEK293 parental cells were included in the assays as negative
controls.

FIGURE 10A. pICAST ALC: Vector for expression of β-galΔα as a C-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔα; GS Linker, (GGGGS)n;
NeoR, neomycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 10B. Nucleotide sequence for pICAST ALC.

FIGURE 11A. pICAST ALN: Vector for expression of β-galΔα as an N-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔα; GS Linker, (GGGGS)n;
NeoR, neomycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 11B. Nucleotide sequence for pICAST ALN.

FIGURE 12A. pICAST OMC: Vector for expression of β-galΔω as a C-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔω; GS Linker, (GGGGS)n;
Hygro, hygromycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 12B. Nucleotide sequence for pICAST OMC.

FIGURE 13A. pICAST OMN: Vector for expression of β-galΔω as an N-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔω; GS Linker, (GGGGS)n;
Hygro, hygromycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 13B. Nucleotide sequence for pICAST OMN.

FIGURE 14. pICAST ALC βArr2: Vector for expression of β-galΔα as a C-terminal
fusion to β-arrestin-2. The coding sequence of human β-arrestin-2 (GenBank Accession
Number: NM_004313) was cloned in frame to β-galΔα in a pICAST ALC vector.

FIGURE 15. pICAST OMC βArr2: Vector for expression of β-galΔω as a C-
terminal fusion to β-arrestin-2. The coding sequence of human β-arrestin-2 (GenBank
Accession Number: NM_004313) was cloned in frame to β-galΔω in a pICAST OMC vector.

FIGURE 16. pICAST ALC βArr1: Vector for expression of β-galΔα as a C-terminal
fusion to β-arrestin-1. The coding sequence of human β-arrestin-1 (GenBank Accession
Number: NM_004041) was cloned in frame to β-galΔα in a pICAST ALC vector.

FIGURE 17. pICAST OMC βArr1: Vector for expression of β-galΔω as a C-
terminal fusion to β-arrestin-1. The coding sequence of human β-arrestin-1 (GenBank
Accession Number: NM_004041) was cloned in frame to β-galΔω in a pICAST OMC vector.

FIGURE 18. pICAST ALC β2AR: Vector for expression of β-galΔα as a C-terminal
fusion to β2 Adrenergic Receptor. The coding sequence of human β2 Adrenergic Receptor
(GenBank Accession Number: NM_000024) was cloned in frame to β-galΔα in a pICAST
ALC vector.

FIGURE 19. pICAST OMC β2AR: Vector for expression of β-galΔω as a C-
terminal fusion to β2 Adrenergic Receptor. The coding sequence of human β2 Adrenergic
Receptor (GenBank Accession Number: NM_000024) was cloned in frame to β-galΔω in a
pICAST OMC vector.

FIGURE 20. pICAST ALC A2aR: Vector for expression of β-galΔα as a C-terminal
fusion to Adenosine 2a Receptor. The coding sequence of human Adenosine 2a Receptor
(GenBank Accession Number: NM_000675) was cloned in frame to β-galΔα in a pICAST
ALC vector.

FIGURE 21. pICAST OMC A2aR: Vector for expression of β-galΔω as a C-terminal
fusion to Adenosine 2a Receptor. The coding sequence of human Adenosine 2a Receptor
(GenBank Accession Number: NM_000675) was cloned in frame to β-galΔω in a pICAST
OMC vector.

FIGURE 22. pICAST ALC D1: Vector for expression of β-galΔα as a C-terminal
fusion to Dopamine D1 Receptor. The coding sequence of human Dopamine D1 Receptor
(GenBank Accession Number: X58987) was cloned in frame to β-galΔα in a pICAST ALC
vector.

FIGURE 23. A schematic depicting the method of the invention, which shows
two inactive mutants that become active when they interact.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

All literature and patents cited in this disclosure are incorporated herein by reference.
The present invention provides a method to interrogate GPCR function and pathways.
The G-protein-coupled superfamily continues to expand rapidly as new receptors are
discovered through automated sequencing of cDNA libraries or genomic DNA. It is
estimated that several thousand GPCRs may exist in the human genome; as many as 250
GPCRs have been cloned and only as few as 150 have been associated with ligands. The
means by which these, or newly discovered orphan receptors, will be associated with their
cognate ligands and physiological functions represents a major challenge to biological and
biomedical research. The identification of an orphan receptor generally requires an
individualized assay and a guess as to its function. The interrogation of a GPCR's signaling
behavior by introducing a replacement receptor eliminates these prerequisites because it can 
be performed with and without prior knowledge of other signaling events. It is sensitive, 
rapid and easily performed and should be applicable to nearly all GPCRs because the 
majority of these receptors should desensitize by a common mechanism. 

Various approaches have been used to monitor intracellular activity in response to a
stimulant, e.g., enzyme-linked immunosorbent assay (ELISA); Fluorescence Imaging Plate
Reader assay (FLIPR™, Molecular Devices Corp., Sunnyvale, CA); EVOscreen™,
EVOTEC™, Evotec Biosystems GmbH, Hamburg, Germany; and techniques developed by
CELLOMICS™, Cellomics, Inc., Pittsburgh, PA.

Germino, F.J., et al., "Screening for in vivo protein-protein interactions," Proc. Natl.
Acad. Sci., 90(3): 933-7 (1993), discloses an in vivo approach for the isolation of proteins
interacting with a protein of interest.

Phizicky, E.M., et al., "Protein-protein interactions: methods for detection and
analysis," Microbiol. Rev., 59(1): 94-123 (1995), reviews biochemical,
molecular biological and genetic methods used to study protein-protein interactions.

Offermanns, et al., "Gα15 and Gα16 Couple a Wide Variety of Receptors to
Phospholipase C," J. Biol. Chem., 270(25): 15175-80 (1995), discloses that Gα15 and Gα16 can
be activated by a wide variety of G-protein-coupled receptors. The selective coupling of an
activated receptor to a distinct pattern of G-proteins is regarded as an important requirement
to achieve accurate signal transduction. Id.

Barak et al., "A β-arrestin/Green Fluorescent Protein Biosensor for Detecting G
Protein-coupled Receptor Activation," J. Biol. Chem., 272(44):27497-500 (1997) and U.S.
Patent No. 5,891,646, disclose the use of a β-arrestin/green fluorescent protein (GFP) fusion
to monitor protein translocation upon stimulation of GPCR.

The present invention involves a method for monitoring protein-protein interactions in 
GPCR pathways as a complete assay using ICAST™ (Intercistronic Complementation 
Analysis Screening Technology as disclosed in pending U.S. patent application serial no. 
053,164, filed April 1, 1998, the entire contents of which are incorporated herein by 
reference). This invention enables an array of assays, including GPCR binding assays, to be 
achieved directly within the cellular environment in a rapid, non-radioactive assay format 
amenable to high-throughput screening. Using existing technology, assays of this type are 
currently performed in a non-cellular environment and require the use of radioisotopes. 

The present invention, combined with Tropix ICAST™ and Advanced Discovery
Sciences™ technologies, e.g., ultra high-throughput screening, provides highly sensitive cell-
based methods for interrogating GPCR pathways which are amenable to high-throughput
screening (HTS). These methods are an advancement over the invention disclosed in U.S.
Patent 5,891,646, which relies on microscopic imaging of GPCR components as fusions with
green fluorescent protein. Imaging techniques are limited by low throughput, lack of
thorough quantification and low signal-to-noise ratios. Unlike yeast-based two-hybrid assays
used to monitor protein/protein interactions in high-throughput assays, the present invention
is applicable to a variety of cells including mammalian cells, plant cells, protozoa cells such
as E. coli and cells of invertebrate origin such as yeast, slime mold (Dictyostelium) and
insects; detects interactions at the site of the receptor target or downstream target proteins 
rather than in the nucleus; and does not rely on indirect read-outs such as transcriptional 
activation. The present invention provides assays with greater physiological relevance and 
fewer false negatives. 

Advanced Discovery Sciences™ is in the business of offering custom-developed 
screening assays optimized for individual assay requirements and validated for automation. 
These assays are designed by HTS experts to deliver superior assay performance. Advanced
Discovery Sciences'™ custom assay development service encompasses the design,
development, optimization and transfer of high performance screening assays. Advanced
Discovery Sciences™ works to design new assays or convert existing assays to ultra-sensitive
luminescent assays ready for the rigors of HTS. Among the technologies developed
by Advanced Discovery Sciences™ is the cAMP-Screen™ immunoassay system. This
system provides ultrasensitive determination of cAMP levels in cell lysates. The
cAMP-Screen™ assay utilizes the high-sensitivity chemiluminescent alkaline phosphatase
(AP) substrate CSPD® with Sapphire-II™ luminescence enhancer.

EXAMPLE

GPCR activation can be measured through monitoring the binding of ligand-activated
GPCR by an arrestin. In this assay system, a GPCR, e.g., β2 adrenergic receptor (β2AR), and a
β-arrestin are co-expressed in the same cell as fusion proteins with β-gal mutants. As
illustrated in Figure 1, the β2AR is expressed as a fusion protein with the Δα form of β-gal
mutant (β2ADRΔα) and the β-arrestin as a fusion protein with the Δω mutant of β-gal (β-
ArrΔω). The two fusion proteins exist inside of a resting (or un-stimulated) cell in separate
compartments, i.e., membrane for GPCR and cytosol for arrestin, and they cannot form an
active β-galactosidase enzyme. When such a cell is treated with an agonist or a ligand, the
ligand-occupied and activated receptor will become a high affinity binding site for arrestin.
The interaction between an activated β2ADRΔα and β-ArrΔω drives the β-gal mutant
complementation. The enzyme activity can be measured by using an enzyme substrate,
which upon cleavage releases a product measurable by colorimetry, fluorescence, or
chemiluminescence (e.g., Tropix product GalScreen™).

Experiment protocol: 

1. In the first step, the expression vectors for β2ADRΔα and βArr2Δω were 
engineered in the selectable retroviral vectors pICAST ALC, as described in Figure 18, and 
pICAST OMC, as in Figure 15. 

2. In the second step, the two expression constructs were transduced into either 
C2C12 myoblast cells or other mammalian cell lines, such as COS-7, CHO, A431, HEK 293, 
and CHW. Following selection with antibiotic drugs, stable clones expressing both fusion 
proteins at appropriate levels were selected. 

3. In the last step, the cells expressing both β2ADRΔα and βArr2Δω were tested for 
response by agonist/ligand-stimulated β-galactosidase activity. Triplicate samples of cells 
were plated at 10,000 cells in a 100 microliter volume per well of a 96-well culture plate. Cells 
were cultured for 24 hours before assay. For the agonist assay (Figures 3 and 4), cells were treated 
with variable concentrations of agonist, for example, (-) isoproterenol, procaterol, 
dobutamine, terbutaline or L-L-phenylephrine, for 60 min at 37 °C. The induced β-galactosidase 
activity was measured by addition of Tropix GalScreen™ substrate (Applied Biosystems), 
and luminescence was measured in a Tropix TR717™ luminometer (Applied Biosystems). For the 
antagonist assay (Figure 5), cells were pre-incubated for 10 min in fresh medium without 
serum in the presence of ICI-118,551 or propranolol, followed by addition of 10 micromolar 
(-) isoproterenol. 
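
By way of illustration only, the triplicate luminescence readings obtained in the agonist and antagonist assays above might be reduced to a fold induction over basal and a percent inhibition as sketched below. The function names, the formulas chosen, and the relative light unit (RLU) values are assumptions made for this sketch; they are hypothetical and are not data or a prescribed analysis from this specification.

```python
from statistics import mean


def fold_induction(stimulated, basal):
    """Mean agonist-stimulated signal divided by mean unstimulated (basal) signal."""
    return mean(stimulated) / mean(basal)


def percent_inhibition(with_antagonist, agonist_only, basal):
    """Percent reduction of the agonist-induced signal in the presence of an antagonist."""
    induced = mean(agonist_only) - mean(basal)        # signal attributable to agonist
    remaining = mean(with_antagonist) - mean(basal)   # signal left after antagonist
    return 100.0 * (1.0 - remaining / induced)


# Hypothetical RLU readings from triplicate wells (illustrative values only).
basal = [1000.0, 1100.0, 900.0]
agonist_only = [5000.0, 5200.0, 4800.0]
with_antagonist = [2000.0, 2100.0, 1900.0]

print("fold induction:", fold_induction(agonist_only, basal))
print("percent inhibition:", percent_inhibition(with_antagonist, agonist_only, basal))
```

Averaging the triplicate wells before taking ratios is one simple choice; other normalizations (for example, per-well background subtraction) could equally be used.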

The assays of this invention, and their application and preparation have been 
described both generically, and by specific example. The examples are not intended as 
limiting. Other substituent identities, characteristics and assays will occur to those of 
ordinary skill in the art, without the exercise of inventive faculty. Such modifications remain 
within the scope of the invention, unless excluded by the express recitation of the claims 
advanced below. 
WHAT IS CLAIMED IS: 

1. A method of assessing the effect of a test condition on G-protein-coupled receptor 
(GPCR) pathway activity, comprising: 

a) providing a cell that expresses a GPCR as a fusion protein to one mutant form of 
reporter enzyme and an interacting protein partner as a fusion to another mutant form of 
enzyme; 

b) exposing the cell to a ligand for said GPCR under said test condition; and 

c) monitoring activation of said GPCR by complementation of said reporter enzyme; 

wherein increased reporter enzyme activity in the cell compared to that which occurs 
in the absence of said test condition indicates increased GPCR interaction with its interacting 
protein partner compared to that which occurs in the absence of said test condition, and 
decreased reporter enzyme activity in the cell compared to that which occurs in the absence of 
said test condition indicates decreased GPCR interaction with its interacting protein partner 
compared to that which occurs in the absence of said test condition. 

2. A method according to Claim 1, wherein the test condition is the presence in the 
cell of a kinase. 

3. A method according to Claim 1, wherein the test condition is the presence in the 
cell of a G-protein. 

4. A method according to Claim 1, wherein the test condition is the exposure of the 
cell to a compound selected from GPCR agonists and GPCR antagonists. 

5. A method according to Claim 1, wherein the test condition is co-expression in the 
cell of a second receptor. 

6. A method according to Claim 5, wherein the second receptor is a GPCR receptor. 

7. A method according to Claim 5, wherein homo-dimerization of GPCR is 
determined. 

8. A method according to Claim 5, wherein hetero-dimerization of GPCR is 
determined. 

9. A method for screening a β-arrestin protein or an unidentified arrestin or arrestin-like 
protein or fragment and mutant form thereof for the ability to bind to activated GPCRs, 
comprising: 

a) providing a cell that: 

i) expresses at least one GPCR as a fusion protein to a reporter enzyme; and 

ii) contains a conjugate comprising a test β-arrestin protein as a fusion protein 
with another reporter enzyme; 

b) exposing the cell to a ligand for said at least one GPCR; and 

c) detecting enzymatic activity of the complemented reporter enzyme; 

wherein an increase in enzymatic activity in the cell indicates β-arrestin protein 
binding to the activated GPCR. 

10. A method for screening a test compound for G-protein-coupled receptor (GPCR) 
agonist activity, comprising: 

a) providing a cell that expresses a GPCR as a fusion protein to one mutant form of 
reporter enzyme and an arrestin protein as a fusion to another mutant form of enzyme; 

b) exposing the cell to a test compound; and 

c) detecting complementation of said reporter enzyme; 

wherein increased reporter enzyme activity after exposure of the cell to the test 
compound indicates GPCR agonist activity of the test compound. 

11. A method according to Claim 10, wherein the cell expresses a GPCR whose 
function is known. 

12. A method according to Claim 10, wherein the cell expresses a GPCR whose 
function is unknown. 

13. A method according to Claim 10, wherein the cell expresses an odorant or taste 
GPCR. 

14. A method according to Claim 10, wherein the cell expresses a β-adrenergic 
GPCR. 

15. A method according to Claim 10, wherein the cell is selected from the group 
consisting of mammalian cells, cells of invertebrate origin, plant cells and protozoa cells. 

16. A method according to Claim 10, wherein the cell endogenously expresses a 
GPCR. 

17. A method according to Claim 10, wherein the cell has been transformed to 
express a GPCR not endogenously expressed by such a cell. 

18. A method of screening a test compound for G-protein-coupled receptor (GPCR) 
antagonist activity, comprising: 

a) providing a cell that expresses a GPCR as a fusion protein to one mutant form of 
reporter enzyme and an arrestin protein as a fusion to another mutant form of enzyme; 

b) exposing the cell to said test compound; 

c) exposing the cell to an agonist for said GPCR; and 

d) detecting complementation of said reporter enzyme; 

where exposure to the agonist occurs at the same time as, or subsequent to, exposure 
to the test compound, and wherein decreased reporter enzyme activity after exposure of the 
cell to the test compound indicates that the test compound is an antagonist for said GPCR. 

19. A method of screening a cell for the presence of a G-protein-coupled receptor 
(GPCR) responsive to a GPCR agonist, comprising: 

a) providing a cell, said cell containing a conjugate comprising a β-arrestin protein as 
a fusion protein with a reporter enzyme; 

b) exposing the cell to a GPCR agonist; and 

c) detecting enzymatic activity of the reporter enzyme; 

wherein an increase in enzymatic activity after exposure of the cell to the GPCR 
agonist indicates that the cell contains a GPCR responsive to said agonist. 

20. A method of screening a plurality of cells for those cells which contain a G- 
protein-coupled receptor (GPCR) responsive to a GPCR agonist, comprising: 

a) providing a plurality of cells, said cells containing a conjugate comprising a 
β-arrestin protein as a fusion protein with a reporter enzyme; 

b) exposing the cells to a GPCR agonist; and 

c) detecting enzymatic activity of the reporter enzyme; 

wherein an increase in enzymatic activity after exposure to the GPCR agonist 
indicates β-arrestin protein binding to a GPCR, thereby indicating that the cell contains a 
GPCR responsive to said GPCR agonist. 

21. A method according to Claim 20, wherein the plurality of cells are contained in a 
tissue. 

22. A method according to Claim 20, wherein the plurality of cells are contained in 
an organ. 

23. A method according to Claim 20, wherein step (b) comprises exposing the cells to 
a plurality of GPCR agonists or ligand libraries. 

24. A substrate having deposited thereon a plurality of cells, said cells expressing at 
least one GPCR as a fusion protein to one mutant form of reporter enzyme and an arrestin 
protein as a fusion to another mutant form of enzyme. 

25. A substrate according to Claim 24, wherein the substrate contains an enzyme- 
labile chemical group which, upon cleavage by the reporter enzyme, releases a product 
measurable by colorimetry, fluorescence or chemiluminescence. 

26. A substrate according to Claim 24, wherein the substrate is made of a material 
selected from glass, plastic, ceramic, semiconductor, silica, fiber optic, diamond, 
biocompatible monomer and biocompatible polymer materials. 

27. A method of detecting G-protein-coupled receptor (GPCR) pathway activity in a 
cell expressing at least one GPCR and containing β-arrestin protein as a fusion protein with a 
reporter enzyme; wherein said enzymatic activity indicates activation of the GPCR pathway. 

28. A method according to Claim 27, where the cells are deposited on a substrate 
prior to detecting said enzymatic activity. 

29. A method according to Claim 27, wherein said cell is contained in a tissue. 

30. A method according to Claim 27, wherein said cell is contained in an organ. 






WO 01/59451 

PCT/US00/24043 






Cellular Expression of β2AR-βgalΔα Fusion Protein in C2 Clones 
(measured by anti-β-gal ELISA) 

Cellular Expression of βArr2-βgalΔω Fusion Protein in C2 Clones 
(measured by anti-β-gal ELISA) 

Agonist Stimulated cAMP Response in C2 Cells Expressing β2AR-βgalΔα 

β-galactosidase Complementation as a Measurement for β2AR-βgalΔα 
Interacting with βArrestin2-βgalΔω upon Agonist Stimulation 

β-galactosidase Activity in Response to Agonist in C2 Cells 
Coexpressing β2AR-βgalΔα and βArrestin2-βgalΔω Fusion Proteins 

β-galactosidase Activity in Response to Agonist in C2 Cells 
Coexpressing β2AR-βgalΔα and βArrestin1-βgalΔω Fusion Proteins 

Inhibition of β-galactosidase Activity in C2 Cells Coexpressing 
β2AR-βgalΔα and βArrestin2-βgalΔω Fusion Proteins 

Antagonist Inhibition of β-galactosidase Activity in C2 Cells 
Coexpressing β2AR-βgalΔα and βArrestin1-βgalΔω Fusion Proteins 

Agonist Stimulated cAMP Response in Clones or Pools of C2 Cells 
Coexpressing A2aR-βgalΔα and βArrestin1-βgalΔω Fusion Proteins 

Agonist Stimulated cAMP Response in Clones or Pools of C2 Cells 
Expressing D1-βgalΔα and βArrestin2-βgalΔω Fusion Proteins 

β2AR-βgalΔα and βArr2-βgalΔω Interaction in HEK 293 
Clones in Response to Isoproterenol Treatment (1 µM) 

β2AR-βgalΔα and βArr1-βgalΔω Interaction in a CHO Pool 
in Response to Isoproterenol Treatment (10 µM) 

β-galactosidase Complementation as a Measurement for 
Adrenergic Receptor Homodimerization in HEK 293 Cells 
Coexpressing β2AR-βgalΔα and β2AR-βgalΔω 

Agonist Stimulated cAMP Response in HEK 293 Cells 
Coexpressing β2AR-βgalΔα and β2AR-βgalΔω 

pICAST ALC 
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pICAST ALC 

1 CTGCAGCCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG 
GACGTCGGAC TTATACCCGG TTTGTCCTAT AGACACCATT CGTCAAGGAC 

51 CCCCGGCTCA GGGCCAAGAA CAGATGGAAC AGCTGAATAT GGGCCAAACA 
GGGGCCGAGT CCCGGTTCTT GTCTACCTTG TCGACTTATA CCCGGTTTGT 

101 GGATATCTGT GGTAAGCAGT TCCTGCCCCG GCTCAGGGCC AAGAACAGAT 
CCTATAGACA CCATTCGTCA AGGACGGGGC CGAGTCCCGG TTCTTGTCTA 

151 GGTCCCCAGA TGCGGTCCAG CCCTCAGCAG TTTCTAGAGA ACCATCAGAT 
CCAGGGGTCT ACGCCAGGTC GGGAGTCGTC AAAGATCTCT TGGTAGTCTA 

201 GTTTCCAGGG TGCCCCAAGG ACCTGAAATG ACCCTGTGCC TTATTTGAAC 
CAAAGGTCCC ACGGGGTTCC TGGACTTTAC TGGGACACGG AATAAACTTG 

251 TAACCAATCA GTTCGCTTCT CGCTTCTGTT CGCGCGCTTC TGCTCCCCGA 
ATTGGTTAGT CAAGCGAAGA GCGAAGACAA GCGCGCGAAG ACGAGGGGCT 

301 GCTCAATAAA AGAGCCCACA ACCCCTCACT CGGGGCGCCA GTCCTCCGAT 
CGAGTTATTT TCTCGGGTGT TGGGGAGTGA GCCCCGCGGT CAGGAGGCTA 

351 TGACTGAGTC GCCCGGGTAC CCGTGTATCC AATAAACCCT CTTGCAGTTG 
ACTGACTCAG CGGGCCCATG GGCACATAGG TTATTTGGGA GAACGTCAAC 

401 CATCCGACTT GTGGTCTCGC TGTTCCTTGG GAGGGTCTCC TCTGAGTGAT 
GTAGGCTGAA CACCAGAGCG ACAAGGAACC CTCCCAGAGG AGACTCACTA 

451 TGACTACCCG TCAGCGGGGG TCTTTCATTT GGGGGCTCGT CCGGGATCGG 
ACTGATGGGC AGTCGCCCCC AGAAAGTAAA CCCCCGAGCA GGCCCTAGCC 

501 GAGACCCCTG CCCAGGGACC ACCGACCCAC CACCGGGAGG CAAGCTGGCC 
CTCTGGGGAC GGGTCCCTGG TGGCTGGGTG GTGGCCCTCC GTTCGACCGG 

551 AGCAACTTAT CTGTGTCTGT CCGATTGTCT AGTGTCTATG ACTGATTTTA 
TCGTTGAATA GACACAGACA GGCTAACAGA TCACAGATAC TGACTAAAAT 

601 TGCGCCTGCG TCGGTACTAG TTAGCTAACT AGCTCTGTAT CTGGCGGACC 
ACGCGGACGC AGCCATGATC AATCGATTGA TCGAGACATA GACCGCCTGG 

651 CGTGGT6GAA CTGACGAffTT CTGAACACCC G6CCGCAACC CTG6GAGACG 
GCACCACCTT GACTGCTCAA GACTTGTGGG CCG6CGTTGG 6ACCCTCT6C 

701 TCCCAGGGAC nTGGGGGCC ( alMI IUKa G CCCGACCTSA GGAAGGGAGT 
AGGGTCCCTG AAACCCCCGG CAAAAACACC GGGCTGGACT CCTTCCCTCA 

751 CGATGT6GAA TCCGACCCC6 TCAGGATATG TGGTTCTGGT AGGAGACGAG 
GCTACACCIT AGGaGGGGC AGTCCTATAC ACCAAGACCA TCCTaGCTC 

801 AACCTAAAAC AffTTCCCGeC TCCGTaGAA nTTTGCnT CGGTriiGGAA 
TrGGATnTG TCAAGGGCGG AGGCAGACTT AAAAACGAAA GCCAAACCTT 

851 CCGAAGCCGC GCGTCTTGTC TGCTGCAGCA TCGTTCTGTG TTGTaCTGT 
GGCTTCGGCG C6CAGAACAG AC6ACGTCGT AGCAAGACAC AACAGAGACA 

901 CTGACTGrrGT TTCTGTATIT GTCTGAAAAT TAGGGCCAGA CTGHACCAC 
GACTGACACA AAfiACATAAA CAGACTTTTA ATCCCGGTCT GACAATGGTG 

951 TCCOTAAGT TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC 
AGGGAATTCA AACTGGAATC CATTGACCTT TCTACAGGTC GCCGAGCGAG 

1001 ACAACCAGTG GGTAGATGTC AAGAAGAGAC GTrGGGnrTAC OTCTGaCT 
TGTTCGTCAG CCATaACAG TTCTTCTCTG CAACCCAATG GAAGACGAGA 

1051 GCAGAATGGC CAACCITTAA CGTCGGATGG CCGCGAGAC6 6CACCTTTAA 
CGTCTTACCG GTTGGAAATT GCAGCaACC GGCGCTCTGC CGTGGAAATT 

1101 CCGAGACaC ATCACCCAGG TTAAGATCAA GGrCTTTTCA CCTCGCXCGC 
QGCTCTGGAG TAGTOGGTCC AATTCTAGTT CCAGAAAAGfT GGACCGGGCG 

1151 ATGGACACCC AGACCAGGTC CCCTACATCG T6ACCTGGGA AGCCTTGGCT 
TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA 

1201 TTTGACCCCC CTCCCTGGGT CMGCCCTTT GTACACCCTA AGCCTCCGCC 
AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG 

1251 TCCTCTTCCr CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA 
AGGAGAAGGA GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT 

1301 CCCCGCCTCG ATCCrCCCIT TATCCAGCCC TCACTCCTTC TCTAQGCGCC 
GGGGCGGAGC TAGGAGGGAA ATAGGiTCGGG AGTGAGGAAG AGATCCGCGG 

1351 GGCC6CFCTA GCCCATTAAT ACGACTCACT ATAGGQCGAT TCGAATCAGQ 
CCGSCGAGAT CGGCTAAtTA TGCTGAGTGA TATCCCGCTA AfiOTAGTCC 

1401 CCTTGGCGCG CCGGATCCTT AATTAAGCGC AATTGGGAGG TGGCGGTAGC 
GGAACCSCGC GGCaAGGAA TTAAnCGCG TTAACCCTCC ACCGCCATCG 

+2 MG VIT DSL AVVA RTD 

3 - — 

1451 CTCGAGATGG GC6TGATTAC GGATTCACTG GCCGTCGTGG CCCGCACCGA 
GAGCTCTACC CGCACTAATG CCTAAGTFGAC CGGCAGCACC GGGCGTGGCT 

+2 RPSQQLRSLNGEWRFA 



1501 TCGCCCTTCC CAACAGTTAC 6GAGCCTGAA TGGCGAATG6 CGCTTraCCr 
AGCGGGAAGG 6TTGTCAATS CGfTCGGACTT ACCGCTTACC GCGAAACGGA 

+2 W F P A PEA V P E S W L E COL 



1551 GGITFCCGGC ACCAGAAGCG 6TGCCGGAAA GCTGGCTGGA 6TGCGATCTT 
CCAAAGGCCG TGGTCTTCGC CACGGCCTTT CGACCGACCT CACGCTAGAA 

+2 PEAD TVV VPS NWQM HGY 



1601 CCTGAGGCCG ATACTGTCGT CGTCCCCTCA AACTGGCAGA TGCACGGITA 
GGACrCCGGC TATGACAGCA GCAGGGGAGT TTGACCGrnT ACGTGCCAAT 

+2 DAPIYTNVTYPITVNP 



1651 CGATGCGCCC ATCTACACCA ACGTGACCTA TCCCATTAC6 GTCAATCCGC 
GCTACGCQGG TAGATGrOGT TGCAaGGAT AGGGTAATGC CAGRAGGCG 

+2 PFVPTENPTGCYSLTFN 



1701 CGirreTTCC CACGGAGAAT CCGACQGGTT GTTACTC6CT CACATTTAAT 
GCAAACAAGG GTGCCTCTTA GGC7GCCCAA CAATGAGCGA GTGTAAATTA 

+2 VDESWLQEGQTRIIFDG 

1751 GTTGATGAAA GCTGGCTACA GGAAGGCCAG ACGCGAATTA TTTTreATGG 
CAACTACTTT CGACCGATCT CCITCCGGTC TGGGCTTAAT AAAAAQACC 

+2 VNSAFHL WCNGRWVGY 

1801 CGrTAACTCG GCGTTTCATC TGTGGTGCAA CGGGCGCTGG GTCGGTrACG 
GCAATTGAGC CGCAAASTAG ACACCACGnTT GCCCGCGACC CAGCCAATGC 

+2 GQDSRLPSEFD LSAFLR 

1851 GCCAGGACAG TCGTTTGCCG TCTGAATTTG ACCTGAGCGC ATmTACGC 
CGGTCCTGTC AGCAAACG6C AGACTTAAAC TGGACTCGCG TAAAAATGCG 

+2 AGEN RLA VMV LRWS DGS 

1901 GCCG6AGAAA ACCGCCTCGC GGTGATGGTG CTGCGCTGGA GTGACGGCAG 
CGGCCrCTTT TGGCGGAGCG CCACTACCAC GACGCGACCT CACTGCCGTC 

+2 YLE DQDM WRM SGI FRD 

1951 TTATCTGGAA GATGAGGATA TGTGGCGGAT GAGCGGCATT TTCCGTGACG 
AATAGACCTT CTAGTCCTAT ACACCGCCTA CTCGCCGTAA AAGGCACTGC 

+2 VSLLHKPTTQISDFHVA 

2001 TcrcGirrccT gcataaaccg actacacaaa tcagcgattt ccATsrrecc 

AGAGCAACGA CGTATTTGGC TBATffrGTTT AGTCGCTAAA GGTACAACGG 

+2 TRFNDDFSRAVLEAEVQ 

2051 ACTCGOTTA ATGATGATTT CAGCCGCGCT GTACTGGAGG CTBAAGTTCA 
T6AGCGAAAT TACTACTAAA GTCGGCGCGA CATGACQCC GACTTCAAGT 
+2 MCGELRDYLRVTVSLW 

2101 GATSTGCQSC GAGmGGGTG ACTACCTACG GGTAACAGTT TCTTTATGGC 
CTACACGCCG CTCAACGCAC TGATGGATGC CCATTGTCAA AGAAATACC6 

+2 QGETQVASGTAPF6GEI 

2151 AGGSTGAAAC GCAGGTCGCC AGCGGCACCG CGCCTTTCGG C66TGAAATT 
TCCCACTTTG CGTCCAGCGG TCGCCffTGGC GCGGAAAGCC GCCACTTTAA 

+2 IDER GGY ADR VTLR LNV 

2201 ATCGATGAGC GTGGTGGTTA TGCCGATC6C GTCACACTAC GTCTGAACGT 
TAGCTACTCG CACCACCAAT ACGGCTAGCG CAGTGTGATG CAGACTTGCA 

+2 EMP KLWS AEI PNL YRA 

2251 CGAAAACCCG AAACTGTGGA GCGCCGAAAT CCCGAATCTC TATCGTGCGG 
GCmTGGGC TTTGACACCT CGCGGCTTTA GGGCTTAGAG ATAGCACGCC 

+2 VVEL HTA D6TL lEA EAC 

2301 TGGTTGAACT GCACACGGCC GACGGCACGC TGATTGAAGC AGAAGCCTGC 
ACCAACTTGA CGTGTCGCGG CTGCCGTGCG ACTAACTTCG TCTTCGGACG 

+2 DVGFREVRIENGLLLLN 

2351 GA7GTCGSTT TCCGCGAGGT GCGGATTGAA AATGGTCTGC TGCTGaGAA 
CTACAGCCAA AGGCGCTCCA CGCCTAACTT TTACCAGACG ACGACGACTT 

+2 GKPLLIRGVNRHEHHP 

2401 CGGCAAGCCG TTGCTGATTC GAGGCGTTAA CC6TCACGAG CATCATCCTC 
GCCGTTCGGC AACGACTAAG CTCCGCAATT GGCACTGCTC GTAGTAGGAG 
+2 LHGQVMDEQTMVQDILL 

2451 TGCATGGTCA GGTCATGGAT GAGCAGACGA TGGTGCAGGA TATCCTGCTG 
ACGTACCAGT CCAGTACCTA CTCGTCTIGCT ACCACGTCCT ATAGGACGAC 

+2 MKQN NFN AVR CSHY PNH 

2501 ATGAAGCAGA ACAACTTTAA C6CGGTGCGC TSTTCGCATT ATCCGAACCA 
TACTTCGTCT TGTTGAAATT GCGGCACGCG ACAAGCGTAA TAGGCTTGGr 

+2 PLWYTLCDRYGLYVVD 

2551 TCCGCrGTGG TACACGCTGT GC6ACCGCTA CGGCCTGTAT GTGGTGGATG 
AG6CGACACC ATGTGCGACA CGCTGGCGAT GCCGGACATA CACCACCTAC 

+2 EAMI ETH GMVP MNR LTD 

2601 AAGCCAATAT TGAAACCCAC GGCATGGTGC CAATGAATCG TCTGACCGAT 
TTCQGTTATA ACTrrGGGIG CCGTACCACG GrTACTTAGC AGACTG6CTA 

+2 DPRW LPA MSE RVTR MVQ 

2651 GATCCGCGCr GQCTACCGGC GATGAGCGAA CGCGTAACGC GAATGGTCCA 
CTAQGCGCGA CCGATGGCC6 CTACTCGCTT GCGCATTGCG CTTACCACCT 

+2 RDRNHPSVIIWSLGNE 

2701 GCGCGATCGT AATCACCCGA GTGTCATCAT CTGGTCGCTG GGGAATGAAT 
CGCGCTAGCA TTAGTCGGCT CACACTAGTA GACCAGCGAC CCCTTACTTA 

+2 SGHGANHDALYRWIKSV 

2751 CAGGCCACGG CGCTAATCAC GACGCGCTGT ATCGOGGAT CAAATCTGTC 
6TCCGGTGCC GCGATTAGTG CTGCGCGACA TAGCGACCTA GITTAGACAG 
+2 DPSRPVQYEGGGADTTA 

2801 GATCCTTCCC 6CCCGGT6CA GTATGAAGGC GGCGGAGCCG ACACCACGGC 
CTAGGAAGGG CGGGCCACGI CATACTTCGG CCGCCTCGGC TGTGGTGCCG 

+2 TDIICPMYARVDEDQP 

2851 CACCGATATT ATTT6CCCGA TGJACGCGGG CGTGGATGAA 6ACCA6CCCT 
GT6GCTATAA TAAACGGGGT ACAT6C6CGC GCACCTACTT CT6GTCGGGA 

+2 FPAVPKWSIKKWLSLPG 

2901 TCCCGGCTGT GCCGAAATGG TCCATCAAAA AATGGCTTTC GCTACCTGGA 
AGGGGCGACA CGGCTTTACC AGGTAGTTTT HACCGAAAG CGAT6GACCT 

+2 ETRPLILCEYAHAMGNS 

2951 GAGACGCGCC CGCTGATCCT TTGCGAATAC GCCCAC6CGA TGGGTAACAG 
CTCTGCGCGG GCGACTAGGA MCGCTTATG CGGGTGCGCT ACCCATTGTC 

+2 LGG FAKY WQA FRQ YPR 

3001 TCrTGGCGGT TTCGCTAAAT ACTGGCAGGC GTITCGTCAG TATCCCCGTT 
AGAACCGCCA AAGCGATTTA TGACCGTCC6 CAAAGCAGTC ATAGGG6CAA 

+2 LQG6 FVW DWVD QSL IKY 

3051 TACAGG6CGS CTTCGTCTGG GACTGGGTGG ATCAGTCGCT GATTAAATAT 
ATGTCCCGCC GAAGCAGACC CTGACCCACC TAGTCAGCGA CTAAnTATA 

+2 DENG NPW SAY G6DFGDT 

3101 GATGAAAACG GCAACCCGTG GTCGGCTTAC GGCGGrGATT TTGGC6ATAC 
CTACmTGC CGTTGGGCAC CAGCCGAATG CCGCCACTAA AACCGCTATG 
+2 PNDRQFCMNGLVFADR 

3151 GCCGAACGAT CGCCAGTTCT CTATGAACGG TCTGGTCnT GCCGACCGCA 
CGQCTTGCTA GCGGTCAAGA CATACTTGCC AGACCAGAAA CGGCTGGCGT 

+2 TPHP ALT EAKH QQQ FFQ 

3201 CGCCGCATCC AGCGCT6ACG GAA6CAAAAC ACCA6CAGCA GnTTTCCAG 
6C6GCGTAG6 TCGCGACTGe CTTCGmTG TGGTGGTCGr CAAAAAGGTC 

+2 FRLSGQTIEVTSEYLFR 

3251 TTCCGTITAT CCGGGCAAAC CATGGAAGTG ACCAGCGAAT ACCTGTTCCG 
AAGGCAAATA GGCCCGITTG GTAGCTTCAC TGGTCGCTTA TGGACAAGGC 

+2 HSDMELLHWMVALD6K 

3301 TCATAGCGAT AACGAGCTCC TGCACTGGAT GGIGGCGCTG GATGGTAAGC 
ACTATCGCTA TTGCTCGAGG ACGTGACCTA CCACCGCGAC CTACCATTCG 

+2 PLAS GEV PLDV APQ GKQ 

3351 CGCTGGCAAG GGGTGAAGTC CCTCTIGGATG TCGCTCCACA AGGTAAACAG 
GCGACCGTTC 6CCACTTCAC GGAGACCTAC AGCGAGGTGT TCCA7TTGTC 

+2 LIELPELPQPESAGQLW 

3401 T7GATTGAAC TGCCTGAACT ACCGCAGCCG 6AGAGCGCCG GGCAACTCTG 
AACTAACTTG ACGGACTTGA TGGCGTCGGC CTCTCGCGGC CCSTTGAGAC 

+2 LTVRVVQPNATAWSEA 

3451 6CTCACAGTA CGCGTAGTGC AACCGAACGC GACCGCATGG TCA6AAGCCG 
CGAGTGTCAT GCGCATCACG TTGGCTTGCG CTGGCGTACC AGTCTTCGGC 
+2 GHIS AMQ QWRL AEN LSV 

3501 GGCACATCAG CGCCTSGCAG CA6TGGCGTC T5GCGGAAAA CCTCAGTIGTG 
CCSTGTAGTC GCGGACCCTC GTCACCGCAG ACCGCCmT GGAGTCACAC 

+2 TLPAASHAIPHLTTSEM 

3551 ACGCTCCCCG CCGCGTCCCA CGCCATCCCG CATCT6ACCA CCAGCGAAAT 
TGCGAGGGGC GGCGCAGG6T GCGGTAGGGC GTAGACTGGT GGJCGCTTTA 

+2 DFCIELGNKRWQFNRQ 

3601 GGATTTTTGC ATCGAGCTGG GTAATAAGCG TrGGCAATTT AACCGCCAGT 
CCTAAAAACG TAGCTC6ACC CATTATTCGC AACCGTTAAA TrGGCGGTCA 

+2 SGFL SQM WIGD KKQ LLT 

3651 CAGGCTTTCT TTCACAGATG TGGATTQGCG ATAAAAAACA ACTGCTGAC6 
GTCCGAAAGA AAGTGTCTAC ACCTAACC6G TAIIIIIIGT TGACGACTGC 

+2 PLRD QFT RAP LDND IGV 

3701 CCGCTGCGCG ATCAGTTCAC CCGTGCACCG CTGGATAACG ACATTGGCGT 
, GGCGACGCGC TAGTCAAGTG GGCACGTGGC GACCTATTGC TGTAACCGCA 

+2 SEATRIDPNAWVERWK 

3751 AAGTCAAGCS ACCCGCATTS ACCCTAACGC CTCGGTCGAA CGCTGGAAGG 
TTCACTTCGC TGGGCGTAAC TGGGATTGCG GACCCAGCTT GCGACCTTCC 

+2 AA6HYQAEAALLQCTAD 

3801 CGGCGGGCCA TTACCAGGCC GAAGCAGCGT TGTTGCAGTG CACGGCAGAT 
GCCGCCCGGT AATGGTCCGG C7TCGTCGCA ACAACGTCAC GTCCCffTCTA 
+2 TLADAVLITTAHAWQHQ 

3851 ACACTTGCTG ATGCGGTGCT GATTACGACC GCTCAGGCGT GGCAfiCATCA 
TGTGAACGAC TACGCCACGA CTAATGCTGG CGAGTGCGCA CCGTCGTAGT 

+2 GKTLFISRKTYRIDGS 

3901 GGGGAAAACC TTATTTATCA GCCGGAAAAC CTACCGGATT GATG6TAGTG 
CCCCTTTTGG AATAAATACT CGGCCTTTTG GATGGCCTAA CTACCA7CAC 

+2 GQMAITVDVEVASDTPH 

3951 GTCAAATGGC GATTACCGTT GATGTTGAAG TGGCGAGCGA TACACCGCAT 
CAGTTTACCG CTAATGGCAA aACAACTTC ACCGCTCGCT ATGTGGC6TA 

+2 PARIGLNCQLAQVAERV 

4001 CC6GCGCG6A TTGGCCTGAA CTGCCAGCTG GCGCAGGTAG CAGAGCGGGT 
GGCCGCGCCr AACCGGACTT GACGGTCGAC CGCGTCCATC GTCTCGCCCA 

+2 NWL GLGPQENYPDRLT 

4051 AAACTGGCTC GGATTAGGGC CGCAAGAAAA CTATCCCGAC C6CCTTACT6 
TTTGACCGAG CCTAATCCCG GCGTTCTnT GATAGGGCTG GCQGAATGAC 

+2 AACFDRWDLPLSDMYTP 

4101 CXGCCTGTTT TGACCGCTCG GATCTGCCAT TGTCA6ACAT GTATACCCCG 
GGCGGACAAA ACTGGCGACC QAGACGGTA ACAGTCTGrA CATATGGGGC 

+2 TVFPSENGLRCGTRELN 

4151 TACGTCTTCC C6A6CGAAAA CGGTCTGCGC TGCGGGACGC GCGAATTGAA 
AT6CAGAAGG GCTCGCTTTT GCCAGACGCG ACGCCCIGCG CGCTTAACTT 
+2 YGPHQWRGDFQFNISR 



4201 TTATGGCCCA CACCAGTGGC 6CGGCGACTT CCAGTTCAAC ATCAGCC6CT 
AATACCGGGT GTCGTCACCG CGCCGCTGAA GGTCAAGTTG TAGTCGGCGA 

+2 YSQQ QLM ETSH RHL LHA 



4251 ACAGTCAACA GCAACTGATG GAAACCAGCC ATCGCCATCT GCTGCACGCG 
TGTCAGTTGT CGTTGACTAC CTTTGGTCGG TAGCGGTAGA CGACGTTGCGC 

+2 EEGTWLNIDGFHMGIGG 



4301 GAAGAAGGCA CATGGCTGAA TATCGAC6GT TTCCATATGG G6ATTGGTGG 
CTTCTTCCGT GTACCGACTT ATAGCTGGCA AAGGTATACC CCTAACCACC 

+2 DDSWSPSVSAEFQLSA 



4351 CGACGACTCC TGGAGCCCGT CAGTATCGGC GGAATTCCAG CTGAGCGCCG 
GCreCTGAGG ACCTCGGGCA 6TCATAGCCG CCITAAGGTC GACTCGCGGC 

+2 6RYH YQL VWCQ KRS DYK 



4401 GTC6CTACCA TTACCAGTTG GTCTGGTGTC AAAAAAGATC TGACTATAAA 
CAGCGATGGT AATGGTCAAC CAGACCACA6 TTTTTTCTAfi ACTGATATFT 

+2 DEDLDHHHHHHR 



4451 GATGAGGACC TCGACCATCA TCATCATCAT CACC6GTAAT AATAGGTAGA 
CTACTCCreG AGCTGGTAGT AGTAGTAGTA GfTBGCCATTA TTATCCATa 

4501 TAAGTGACTG ATTAGATGCA TTGATCCCTC GACCAAHCC GGTTATnTC 
ATTCACT6AC TAATCTACGT AACTAGGGAG CTGGrTAAGG CCAATAAAAG 

4551 CACCATATTG CCGTCTTTTG GCAATGTGAG GGCCCGGAAA CCTGGCCCTG 
GTGGTATAAC GGCAGAAAAC CGTTACACTC CCGGGCCTFT GGACCGGGAC 
4601 TCTTCTTGAC GAGCATTCCT AGGGGTCTTT CCCCTCTCGC CAAAGGAATG 
AGAAGAACTG CTCGTAAGGA TCCCCAGAAA GGGGAGAGCG GTTTCCTTAC 

4651 CAAGGTCTCT TGAATCTCGT GAAGGAAGCA GTTCCTCreG AAGCITCTTS 
GTTCCAGACA ACTTACAGCA CITCCTTCGT CAAGGAGACC TTCGAAGAAC 

4701 AAGACAAACA ACGTCTGTAG CGACCCTTTG CAGGCAGCGG AACCCCCCAC 
TTCTGnTGT TGCAGACATC GCTGGGAAAC GTCCGTCGCC TTGGGGGGTG 

4751 CTGGCGACAG GTGCCTCreC GGCCAAAAGC CACGTGTATA AGATACACCT 
6ACCGCTGTC CACGGAGACG CCGGTTTTCG GTCCACATAT TCTATGTGGA 

4801 GCAAAGGCGG CACAACCCCA GT6CCACGTT GTGAGTTGGA TAGTTGT6GA 
C6T7TCCGCC GTGTTGGGGT CACGGTGCAA CACTCAACCT ATCAAGACCT 

4851 AAGAGTCAAA TGGCTCTCCT CAAGCGTATT CAACAAGGGG CTGAAGGATG 
TTCTCAGTrT ACCGAGAGGA GTTCGCATAA GTTGTTCCCC GACTTCCTAC 

4901 CCCAGAAGGT ACCCCATTGT ATGGGATCTG ATCTGGGGCC TCGGTGCACA 
GGGTCTTCCA TGGGGTAACA TACCCTAGAC TAGACCCCGG AGCCAC6TGT 

4951 TGCTTTACAT GTSTTTAGTC GAGGTTAAAA AACGTCTAGG CCCCCCGAAC 
ACGAAATGTA CACAAATCA6 CrCCAATTTT TTGCAGATCC GGGGSGCTTG 

5001 CACGGGGACG TGGmTCCT HGAAAAACA CGATGATAAT ACCATGATTG 

GTGCCCCTGC ACCAAAAGGA AACTTTTTCT GCTACTAnA TGGTACTAAC 

5051 AACAAGATGG ATTCCACGCA GGTTCTCCGG CCGCTTGG6T GGAGAGGCTA 
TTCTTCTACC TAACGTCCGT CCAAGAGGCC GGCGAACCCA CCTCTCCGAT 

5101 nCGGCTATG ACTGGGCACA ACAGACAATC GGCTGCTCTG ATGCCGCCGT 
AAGCCGATAC TGACCCGTGT TGTCTGTTAG CCGACGAGAC TACGGCGGCA 

5151 GTTCCGGCre TCAGCGCAGG GGCGCCCGGT TCI 1 1 1 iUIC AAGACCGACC 
CAAGGCCGAC AGTCGC6TCC CCGCGGGCCA AGAAAAACAG TTCTGGCTGG 
5201 TGTaGGTGC CCTGMTGAA CTGCAGGACG AGGCAGCGCG GCTATCGTGG 
ACAGGCCACG GGAGTACTT GACGTCCTGC TCCGTCGCGC C6ATAGCACC 

5251 CTGGCCACGA CGGGCGTTCC TTGCGCAGCT GTTSCTCGACG TTGTCACTGA 
GACCGGiTGCT GCCCGGAAS6 AACGCGTCGA CACGAGCTGC AACAGTIGACT 

5301 A6CGGGAAGG GACTGGCTGC TATTGGGCGA AGTGGCGGGG CAGGATCTCC 
TCGGCCTTCC CTGACCGACG ATAACCC6CT TCACGGCCCC GTCCTAGAGG 

5351 TGTCATCTCA CCTTGCTCCT GCCGAGAAAG TATCCATCAT GGCTGATGCA 
ACAGTAGAGT GGAACGAGGA CGGCrCTITC ATAGGTAGTA CCGACTACGT 

5401 ATGCGGCG6C TGCATACGCT TGATCCGGCT ACCTGCCCAT TCGACCACCA 
TACGCCGCCG ACGTATGC6A ACTAGGCCGA TGGACGGGTA AGCTGGTGGT 

5451 A6CGAAACAT CGCATCGAGC GAGCACGTAC TCGGATGGAA GCCGGTCTTG 
TCGCnTGTA GCGTAGCTCG CTCGTGtkrG AGCCTACCTT CGGCCAGAAC 

5501 TCGATCAGGA TGATCTGGAC GAAGAGCATC AGGGGCTCGC GCCAGCCGAA 
AGCTAGTCCT ACTAGACCTG CTTCTCGTAG TCCCCGAGCG CGGTCGGCTT 

5551 CTSITCGCCA GGCTCAAGGC GCGCATGCCC GACGGCGAGG ATCTCGTCGir 
GACAAGCGGT CCGAGITCCG CGCGTACGGG CTGCCGCTCC TAGAGCAGCA 

5601 GACCCATGGC GATGCCTGCT TGCCGAATAT CATGGTGGAA AATGGCCGCT 
CTGGGTACCG CTACGGACGA ACGGCTTATA GTACCACCTT TTACCGGCG^ 

5651 nrCTBGAIT CATCGACTCT GGCCGGCTQG GTGTGGCGGA CCGCTATCAG 
AAAGACCTAA GTAGCTGACA CCGGCCGACC CACACCGCCT GGCGATAGTC 

5701 GACATAGCGT TGGCTACCCG TGATATTGCT GAAGAGCTTG GCGGCGAATG 
CTSTATCGCA ACCGATGGGC ACTATAACGA CTTCTCGAAC CGCCGCTTAC 

5751 QGCTGACCGC TTCCTCGTGC TTTACGGTAT CGCCGCTCCC GATTCGCAGC 
CCGACTGGCG AAGGAGCACG AAATGCCATA GCGGCGAGGG CTAAGCGICG 
5801 GCATCGCCTT CTATCGCCTT CTTGACGAGT TCTTCTGAGC GG6ACTC71GG 
CGTAGCGGAA GATAGCSGAA GAACTGCTCA AGAAGACTCG CCCTGAGACC 

5851 GG^'CGCATC GATAAAATAA AAGATTTTAT TTAGTCTCCA GAAAAAGGQG 
CCAAGCCTAG CTATTTTATT TTCTAAAATA AATCAGAGGT CmTTCCCC 

5901 GGAATGAAAG ACCCCACCTG TAGGTTTGGC AAGCTAGCTT AAGTAACGCC 
CCTTACTnC TGGGGTGGAC ATCCAAACCG TTCGATCGAA TTGATTGCGG 

5951 ATTITSCAAS GCATBGAAAA ATACATAACT GAGAATAGAG AAGTTCAGAT 
TAAAACGTTC CSTACCmT TATGTATTGA CTCTTATCTC TTCAAGTCTA 

6001 CAAGGTCAGG AACAGATGGA ACA6CTGAAT ATGGGCCAAA GAG6ATATCT 
ffTTCCAGTCC TTGTCTACCT TGTCGACTTA TACCCGGTTT GTCCTATAGA 

6051 GTGGTAAGCA GrTTCCTGCCC CGGCTCAGGG CCAAGAACAG ATGGAACAGC 
CACCATTCGT CAAGGACGGG GCCGAGTCCC GGTrCTTGTC TACCTTGrCG 

6101 TGAATATGGG CCAAACAGGA TATCTGTGOT AAGCAGTTCC TGCCCCGGCT 
. ACTTATACCC GGTTTCTCCT ATAGACACCA TTCGTCAAGG ACGGGGCCGA 

6151 CAGGGCCAAG AACAGATGGT CCCCAGATGC GGTCCAGCCC TCAGCAGTTT 

GTCCCGffrrC ttgtctacca ggggtctacg ccaggtcggg agtcgtcaaa 

6201 ctagagaacc atcagatgtt tccagggtgc cccaaggacc tgaaatgacc 
gatctcttgg tagtctacaa aggtcccacg gggttcctgg actttactgg 

6251 ctbrecctta tttgaactaa ccaatcagtt cgcttctcgc ttctsttcgc 
gacacggaat AAAOTGATT ggttagtcaa gcgaagagcg aagacaagcg 

6301 GCGCTTCTGC TCCCCGAGa caataaaaga gcccacaacc cctcactcgg 
cgcgaagacg aggggctcga gttattttct cgggtgttgg ggagtgagcc 

6351 ggcgccagtc ctccgatrea ctgagtcgcc cgggtacccg tgtatccaat 
ccgcggtcag gaggctaact gactcagcgg gcccatgggc acataggtta 
6401 AAACCCTCTT GCAGTTBCAT CCGACTTGT6 GTCTCGCTGT TCCTTGGGAG 
TTTGGGAGAA CGTCAACGTA GGCT6AACAC CAGAGCGACA AGGAACCCTC 

6451 GGTCTCCTCr GAGrGATTGA CTACCCGTCA GCGGGGGTCT TTCATTCAT6 
CCAGAGGAGA CTCACTAACT 6A7QGGCAGT CGCCCCCAGA AAGTAAGTAC 

6501 CAGCATGTAT CAAAATTAAT TTGGIIIIII 1TCTTAAGTA TTTACATTAA 
GTCGTACATA GnTTAATTA AACCAAAAAA AAGAATTCAT AAATGTAATT 

6551 ATGGCCATAG TTGCATTAAT GAATCGGCCA ACGCGCGGGG AGAGGCGGTT 
TACCGGTATC AACGTAATTA CTTAGCCGGT TGCGCGCCCC TCTCCGCCAA 

6601 TGCGTATTGG CGCTCTTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG 
ACGCATAACC GCGAGAAGGC GAAGGAGCGA GTGACTGAGC GACGCGAGCC 

6651 TCGTTCGGCT GCGGCGAGCG GTATCAGCTC ACTCAAAGGC GGTAATACGG 
AGCAAGCCGA CGCCGCTCGC CATAGTCGAG TGA6TTTCCG CCATTATGCC 
CTGCAGCCTG AATATGGGCC AAACAGGATA TCTGT6GTAA GCAGTTCCTG CCCCGGGTCA 60 

GACGTCGGAC TTATACCCGG TTTGTCCTAT AGACACCATT CGTCAAGGAC GGGGCCGAfiT 60 

GGGCCAAGAA CAGATGGAAC AGCFGAATAT GGGCCAAACA GGATATCTGT GGTAAGCAGT 120 

CCCGGIiCii GTCTACCrre TC6ACTTATA CCCGGTrTCT CCTATAGACA CCATTCGTCA 120 

TCCTGCCCCG GCTCAGGGCC AAGAACA6AT GGTCCCCAGA TGCG6TCCA6 CCCTCAGCAG 180 

AGGACGGGGC CGAGTCCC66 TTCTTGTCTA CCAGGGGTCT ACGCCAGGTC GGGAGTCGTC 180 

nrCTAGAGA ACCATCAGAT GTTTCCAGGG TGCCCCAAGG ACCTGAAATG ACCCT6T6CC 240 

AAAGATCTCT TGGTAGTCTA CAAAGGTCCC ACGGGGITCC TGGACTTTAC TGGGAGACGG 240 

TTATTTGAAC TAACCAATCA GTTCGCTTCT CGCTTCTGTT CGCGCGCTTC TGCTCCCCGA 300 

AATAAACTTG ATTGGTTAGT CAAGCGAAGA GCGAAGACAA GCGCGCGAAG ACGAGGGGQ 300 

GCTCAATAAA AGAGCCCACA ACCCGTCACT CGGGGCGCCA GTCCTCCGAT TGACTGAGTC 360 

CGAGTTAT1T TCTCGGGTGT TGGGGAGTGA GCCCCGCGGT CAGGAGGCTA ACTGAaCAG 360 

GCCCGGGTAC CCGTGTATCC AATAAACCCT CTTGCAGTTG CATCCGACTT GTGGTCTCGC 420 

CGGGCCCATG GGCACATA66 TTATTTGGGA GAACGTCAAC GTAGGCTGAA CACCAGA6CG 420 

TGTTCCTTGG GAGGGTCTCC TCTGAGTGAT TGACTACCCG TCAGCGGGGG TCnTCATTT 480 

ACAAGGAACC CTCCCAGAGG AGACTeACTA ACTGATQGGC AGTCGCCCCC AGAAACTAAA 480 

GGGGGCTCGT CCGGGATCGG GAGACCCCTG CCCAGGGACC ACCGACCCAC CACCGGGAGG 540 

CCCCCGAGCA GGCCCTAGCC CTCTGGGGAC GGGTCCCTGG TGGCTGGGTG GTGGCCCTCC 540 

CAAGCTGGCC AGCAACTtAT CTSTGTCTGT CCGATTGTCT AGT6TCTATG ACTGATTTTA 600 

GfTTCGACCGG TCGTTGAATA GACACAGACA GGCTAACAGA TCACAGATAC TGAQAAAAT 600 

TGCGCCTGCG TCGGTACTAG TTAGCTAACT A6CTCTGTAT CTGGCGGACC CGTGGTGGAA 660 

ACGCGGAC6C A6CCATGATC AATC6A7TGA TCGAGACATA GACCGCCTG6 GCACCACCTT 660 

CTGACGAGTT CTGAACACCC 6GCCGCAACC CTGGGAGACG TCCCAGGGAC TTTGGG6GCC 720 

GACTGCrCAA GACTTGTGGG CCGGCGTTGG GACCCTCT6C AGGGTCCCTG AAACCCCCGG 720 

GIIIIIGTGG CCCGACCTGA GGAAGGGAGT CGATGTGGAA TCCGACCCCG TCAGGATATG 780 

CAAAAACACC GGGCTGGACT CCTTCCCTCA GCTACACCTT AGGCTGGGGC AGTCCTATAC 780 
TGGTTCTGGT AGGAGACGAG AACCTAAAAC AGTTCCCGCC TCCGTCTGAA TTTTTGCnT 840 

ACCAAGACCA TCCTCTGCTC TTGGATnTG TCAAGGGCGG AGGCAGACTT AAAAACGAAA 840 

CGGirrcGAA CCGAAGCCGC GCGTCTTGTC TGCTGCAGCA TCGTTCTSTG TTGTCTCTCr 900 

GCCAAACCTT GGCTTCGGCG CGCAGAACAG ACGACGTCGT AGCAAGACAC AACAGAGACA 900 

CTGACTGTGT TTCTGTATTT GTCTGAAAAT TAGGGCCAGA CTGTTACCAC TCCCTTAAGT 960

GACTGACACA AAGACATAAA CAGACTTTTA ATCCCGGTCT GACAATGGTG AGGGAATTCA 960

TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC ACAACCAGTC GGTAGATGTC 1020

AACTGGAATC CATTGACCTT TCTACAGCTC GCCGAGCGAG TGTTGGTCAG CCATCTACAG 1020 

AAGAAGAGAC GTTGGGTTAC CTTCTGCTCT GCAGAATGGC CAACCTTTAA CGTCGGATGG 1080

TTCTTCTCTG CAACCCAATG GAAGACGAGA CGTCTTACCG GTTGGAAATT GCAGCCTACC 1080

CCGCGAGACG GCACCTTTAA CCGAGACCTC ATCACCCAGG TTAAGATCAA GGTCTTTTCA 1140

GGCGCTCTGC CGTGGAAATT GGCTCTGGAG TAGTGGGTCC AATTCTAGTT CCAGAAAAGT 1140

CCTGGCCCGC ATGGACACCC AGACCAGGTC CCCTACATCG TGACCTGGGA AGCCTTGGCT 1200 

GGACCGGGCG TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA 1200

TTTGACCCCC CTCCCTGGGT CAAGCCCTTT GTACACCCTA AGCCTCCGCC TCCTCTTCCT 1260

AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG AGGAGAAGGA 1260 

CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA CCCCGCCTCG ATCCTCCCTT 1320

GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT GGGGCGGAGC TAGGAGGGAA 1320

TATCCAGCCC TCACTCCTTC TCTAGGCGCC GGCCGCTCTA GCCCATTAAT ACGACTCACT 1380 

ATAGGTCGGG AGTGAGGAAG AGATCCGCGG CCGGCGAGAT CGGGTAATTA TGCTGAGTGA 1380 

ATAGGGCGAT TCGAACACCA TGCACCATCA TCATCATCAC GTCGACTATA AAGATGAGGA 1440 

TATCCCGCTA AGCTTGTGGT ACGTGGTAGT AGTAGTAGTG CAGCTGATAT TTCTACTCCT 1440

CCTCGAGATG GGCGTGATTA CGGATTCACT GGCCGTCGTG GCCCGCACCG ATCGCCCTTC 1500

GGAGCTCTAC CCGCACTAAT GCCTAAGTGA CCGGCAGCAC CGGGCGTGGC TAGCGGGAAG 1500 

CCAACAGTTA CGCAGCCTGA ATGGCGAATG GCGCTTTGCC TGGTTTCCGG CACCAGAAGC 1560

GGTTGTCAAT GCGTCGGACT TACCGCTTAC CGCGAAACGG ACCAAAGGCC GTGGTCTTCG 1560 
GGTGCCGGAA AGCTGGCTGG AGTGCGATCT TCCTGAGGCC GATACTGTCG TCGTCCCCTC 1620

CCACGGCCTT TCGACCGACC TCACGCTAGA AGGACTCCGG CTATGACAGC AGCAGGGGAG 1620

AAACTGGCAG ATGCACGGTT ACGATGCGCC CATCTACACC AACGTGACCT ATCCCATTAC 1680 

TTTGACCGTC TACGTGCCAA TGCTACGCGG GTAGATGTGG TTGCACTGGA TAGGGTAATG 1680

GGTCAATCCG CCGTTTGTTC CCACGGAGAA TCCGACGGGT TGTTACTCGC TCACATTTAA 1740

CCAGTTAGGC GGCAAACAAG GGTGCCTCTT AGGCTGCCCA ACAATGAGCG AGTGTAAATT 1740

TGTTGATGAA AGCTGGCTAC AGGAAGGCCA GACGCGAATT ATTTTTGATG GCGTTAACTC 1800

ACAACTACTT TCGACCGATG TCCTTCCGGT CTGCGCTTAA TAAAAACTAC CGCAATTGAG 1800 

GGCGTTTCAT CTGTGGTGCA ACGGGCGCTG GGTCGGTTAC GGCCAGGACA GTCGTTTGCC 1860

CCGCAAAGTA GACACCACGT TGCCCGCGAC CCAGCCAATG CCGGTCCTGT CAGCAAACGG 1860 

GTCTGAATTT GACCTGAGCG CATTTTTACG CGCCGGAGAA AACCGCCTCG CGGTGATGGT 1920 

CAGACTTAAA CTGGACTCGC GTAAAAATGC GCGGCCTCTT TTGGCGGAGC GCCACTACCA 1920

GCTGCGCTGG AGTGACGGCA GTTATCTGGA AGATCAGGAT ATGTGGCGGA TGAGCGGCAT 1980

CGACGCGACC TCACTGCCGT CAATAGACCT TCTAGTCCTA TACACCGCCT ACTCGCCGTA 1980

TTTCCGTGAC GTCTCGTTGC TGCATAAACC GACTACACAA ATCAGCGATT TCCATGTTGC 2040

AAAGGCACTG CAGAGCAACG ACGTATTTGG CTGATGTGTT TAGTCGCTAA AGGTACAACG 2040

CACTCGCTTT AATGATGATT TCAGCCGCGC TGTACTGGAG GCTGAAGTTC AGATGTGCGG 2100

GTGAGCGAAA TTACTACTAA AGTCGGCGCG ACATGACCTC CGACTTCAAG TCTACACGCC 2100

CGAGTTGCGT GACTACCTAC GGGTAACAGT TTCTTTATGG CAGGGTGAAA CGCAGGTCGC 2160

GCTCAACGCA CTGATGGATG CCCATTGTCA AAGAAATACC GTCCCACTTT GCGTCCAGCG 2160 

CAGCGGCACC GCGCCTTTCG GCGGTGAAAT TATCGATGAG CGTGGTGGTT ATGCCGATCG 2220

GTCGCCGTGG CGCGGAAAGC CGCCACTTTA ATAGCTACTC GCACCACCAA TACGGCTAGC 2220

CGTCACACTA CGTCTGAACG TCGAAAACCC GAAACTGTGG AGCGCCGAAA TCCCGAATCT 2280

GCAGTGTGAT GCAGACTTGC AGCTTTTGGG CTTTGACACC TCGCGGCTTT AGGGCTTAGA 2280

CTATCGTGCG GTGGTTGAAC TGCACACCGC CGACGGCACG CTGATTGAAG CAGAAGCCTG 2340

GATAGCACGC CACCAACTTG ACGTGTGGCG GCTGCCGTGC GACTAACTTC GTCTTCGGAC 2340
CGATGTCGGT TTCCGCGAGG TGCGGATTGA AAATGGTCTG CTGCTGCTGA ACGGCAAGCC 2400

GCTACAGCCA AAGGCGCTCC ACGCCTAACT TTTACCAGAC GACGACGACT TGCCGTTCGG 2400

GTTGCTGATT CGAGGCGTTA ACCGTCACGA GCATCATCCT CTGCATGGTC AGGTCATGGA 2460

CAACGACTAA GCTCCGCAAT TGGCAGTGCT CGTAGTAGGA GACGTACCAG TCCAGTACCT 2460

TGAGCAGACG ATGGTGCAGG ATATCCTGCT GATGAAGCAG AACAACTTTA ACGCCGTGCG 2520

ACTCGTCTGC TACCACGTCC TATAGGACGA CTACTTCGTC TTGTTGAAAT TGCGGCACGC 2520

CTGTTCGCAT TATCCGAACC ATCCGCTGTG GTACACGCTG TGCGACCGCT ACGGCCTGTA 2580

GACAAGCGTA ATAGGCTTGG TAGGCGACAC CATGTGCGAC ACGCTGGCGA TGCCGGACAT 2580

TGTGGTGGAT GAAGCCAATA TTGAAACCCA CGGCATGGTG CCAATGAATC GTCTGACCGA 2640

ACACCACCTA CTTCGGTTAT AACTTTGGGT GCCGTACCAC GGTTACTTAG CAGACTGGCT 2640

TGATCCGCGC TGGCTACCGG CGATGAGCGA ACGCGTAACG CGAATGGTGC AGCGCGATCG 2700

ACTAGGCGCG ACCGATGGCC GCTACTCGCT TGCGCATTGC GCTTACCACG TCGCGCTAGC 2700

TAATCACCCG AGTGTGATCA TCTGGTCGCT GGGGAATGAA TCAGGCCACG GCGCTAATCA 2760

ATTAGTGGGC TCACACTAGT AGACCAGCGA CCCCTTACTT AGTCCGGTGC CGCGATTAGT 2760

CGACGCGCTG TATCGCTGGA TCAAATCTGT CGATCCTTCC CGCCCGGTGC AGTATGAAGG 2820

GCTGCGCGAC ATAGCGACCT AGTTTAGACA GCTAGGAAGG GCGGGCCACG TCATACTTCC 2820

CGGCGGAGCC GACACCACGG CCACCGATAT TATTTGCCCG ATGTACGCGC GCGTGGATGA 2880 

GCCGCCTCGG CTGTGGTGCC GGTGGCTATA ATAAACGGGC TACATGCGCG CGCACCTACT 2880 

AGACCAGCCC TTCCCGGCTG TGCCGAAATG GTCCATCAAA AAATGGCTTT CGCTACCTGG 2940

TCTGGTCGGG AAGGGCCGAC ACGGCTTTAC CAGGTAGTTT TTTACCGAAA GCGATGGACC 2940

AGAGACGCGC CCGCTGATCC TTTGCGAATA CGCCCACGCG ATGGGTAACA GTCTTGGCGG 3000 

TCTCTGCGCG GGCGACTAGG AAACGCTTAT GCGGGTGCGC TACCCATTGT CAGAACCGCC 3000

TTTCGCTAAA TACTGGCAGG CGTTTCGTCA GTATCCCCGT TTACAGGGCG GCTTCGTCTG 3060

AAAGCGATTT ATGACCGTCC GCAAAGCAGT CATAGGGGCA AATGTCCCGC CGAAGCAGAC 3060

GGACTGGGTG GATCAGTCGC TGATTAAATA TGATGAAAAC GGCAACCCGT GGTCGGCTTA 3120 

CCTGACCCAC CTAGTCAGCG ACTAATTTAT ACTACTTTTG CCGTTGGGCA CCAGCCGAAT 3120
CGGCGGTGAT TTTGGCGATA CGCCGAACGA TCGCCAGTTC TGTATGAACG GTCTGGTCTT 3180

GCCGCCACTA AAACCGCTAT GCGGCTTGCT AGCGGTCAAG ACATACTTGC CAGACCAGAA 3180

TGCCGACCGC ACGCCGCATC CAGCGCTGAC GGAAGCAAAA CACCAGCAGC AGTTTTTCCA 3240

ACGGCTGGCG TGCGGCGTAG GTCGCGACTG CCTTCGTTTT GTGGTCGTCG TCAAAAAGGT 3240

GTTCCGTTTA TCCGGGCAAA CCATCGAAGT GACCAGCGAA TACCTGTTCC GTCATAGCGA 3300

CAAGGCAAAT AGGCCCGTTT GGTAGCTTCA CTGGTCGCTT ATGGACAAGG CAGTATCGCT 3300

TAACGAGCTC CTGCACTGGA TGGTGGCGCT GGATGGTAAG CCGCTGGCAA GCGGTGAAGT 3360

ATTGCTCGAG GACGTGACCT ACCACCGCGA CCTACCATTC GGCGACCGTT CGCCACTTCA 3360

GCCTCTGGAT GTCGCTCCAC AAGGTAAACA GTTGATTGAA CTGCCTGAAC TACCGCAGCC 3420

CGGAGACCTA CAGCGAGGTG TTCCATTTGT CAACTAACTT GACGGACTTG ATGGCGTCGG 3420

GGAGAGCGCC GGGCAACTCT GGCTCACAGT ACGCGTAGTG CAACCGAACG CGACCGCATG 3480 

CCTCTCGCGG CCCGTTGAGA CCGAGTGTCA TGCGCATCAC GTTGGCTTGC GCTGGCGTAC 3480

GTCAGAAGCC GGGCACATCA GCGCCTGGCA GCAGTGGCGT CTGGCGGAAA ACCTCAGTGT 3540 

CAGTCTTCGG CCCGTGTAGT CGCGGACCGT CGTCACCGCA GACCGCCTTT TGGAGTCACA 3540

GACGCTCCCC GCCGCGTCCC ACGCCATCCC GCATCTGACC ACCAGCGAAA TGGATTTTTG 3600

CTGCGAGGGG CGGCGCAGGG TGCGGTAGGG CGTAGACTGG TGGTCGCTTT ACCTAAAAAC 3600

CATCGAGCTG GGTAATAAGC GTTGGCAATT TAACCGCCAG TCAGGCTTTC TTTCACAGAT 3660 

GTAGCTCGAC CCATTATTCG CAACCGTTAA ATTGGCGGTC AGTCCGAAAG AAAGTGTCTA 3660 

GTGGATTGGC GATAAAAAAC AACTGCTGAC GCCGCTGCGC GATCAGTTCA CCCGTGCACC 3720

CACCTAACCG CTATTTTTTG TTGACGACTG CGGCGACGCG CTAGTCAAGT GGGCACGTGG 3720

GCTGGATAAC GACATTGGCG TAAGTGAAGC GACCCGCATT GACCCTAACG CCTGGGTCGA 3780

CGACCTATTG CTGTAACCGC ATTCACTTCG CTGGGCGTAA CTGGGATTGC GGACCCAGCT 3780

ACGCTGGAAG GCGGCGGGCC ATTACCAGGC CGAAGCAGCG TTGTTGCAGT GCACGGCAGA 3840

TGCGACCTTC CGCCGCCCGG TAATGGTCCG GCTTCGTCGC AACAACGTCA CGTGCCGTCT 3840 

TACACTTGCT GATGCGGTGC TGATTACGAC CGCTCACGCG TGGCAGCATC AGGGGAAAAC 3900 

ATGTGAACGA CTACGCCACG ACTAATGCTG GCGAGTGCGC ACCGTCGTAG TCCCCTTTTG 3900
CTTATTTATC AGCCGGAAAA CCTACCGGAT TGATGGTAGT GGTCAAATGG CGATTACCGT 3960
GAATAAATAG TCGGCCTTTT GGATGGCCTA ACTACCATCA CCAGTTTACC GCTAATGGCA 3960

TGATGTTGAA GTGGCGAGCG ATACACCGCA TCCGGCGCGG ATTGGCCTGA ACTGCCAGCT 4020
ACTACAACTT CACCGCTCGC TATGTGGCGT AGGCCGCGCC TAACCGGACT TGACGGTCGA 4020

GGCGCAGGTA GCAGAGCGGG TAAACTGGCT CGGATTAGGG CCGCAAGAAA ACTATCCCGA 4080
CCGCGTCCAT CGTCTCGCCC ATTTGACCGA GCCTAATCCC GGCGTTCTTT TGATAGGGCT 4080

GCGCCTTACT GCCGCCTCTT TTGAGCGCTG GGATCTGGCA TrETCAGACA TGTATACCCG 4140 
6GGGGAATGA GGGCGGACAA AACTGGGGAC GCTAGAGG6T AAGAGTCTGT ACATATGGGG 4140 : 

GTACGTCTTC CCGAGCGAAA ACGGTCTGCG CTGCGGGACG CGCGAATTGA ATTATGGCCC 4200 
CATGCAGAAG GGCTCGCTTT TGCCAGACGC GACGCCCTGC GCGCTTAACT TAATACCGGG 4200

ACACGAGTGG GGCGGCGACT TGGAGTTCAA GATCAGCGGC TACAGTICAAC AGCAAGTGAT 4260 

TGTGGTGACC GGGGG6GTGA AGGTCAAGTT GTAGTGGGGG ATGTCAG7TG TGGTTGACTA 4260 

GGAAACGAGC CATCGGCATC TGCTGCACGC GGAAGAAGGC ACATGGCTGA ATATCGACGG 4320 

GCTTTGGTCG GTAGGGGTAG AGGAGGTGGG CCTTGnGGG TGTAGGGACT TATA6CTGCC 4320 

"FirGGATATG GGGAlTSGre GGGACGACTC CTGGAGCGGG TCAGTATCGG CGGAATTCCA 4380 

AAAGGTATAC GGCTAACGAC CGCTGCTGAG GACCTGGGGG AGIGATAGCC GGCTTAAGGT 4380 

GCTGAGCGCG GGTGGCTACC ATTACCAGTT GGTGTGGTGT CAAAAAAGAT CTGGAGGTGG 4440 

CGACTGGGGG GGAGCGATGG TAATG6TGAA GGAGACCAGA GllilllGTA GACQGGAGG 4440 

TGGCAGCAGG CCTTGGCGCG CCGGATCCTT AATTAACAAT TGACCGGTAA TAATAGGTAG 4500

ACCGTCGTCC GGAACCGCGC GGCCTAGGAA TTAATTGTTA ACTGGCCATT ATTATCCATC 4500

ATAAGTGACT GATTAGATGG ATTGATGGCT GGACGAATTG GGGTTATnT CGAGGATATT 4560 

TATTCACTGA CTAATCTAGG TAACTAGGGA GCrGGHAAG GGGAATAAAA GGTGGTATAA 4560 

GCCGTCnrr GGCAATGTGA GGGGCGGGAA AGCreGCGCr GTCTTCTTGA GGAGGATTCC 4620 

CGGGAGAAAA CCGTTACAGT CCGGGGCCTT TGGAGCGGGA CAGAAGAACT GCTGGTAAGG 4620 

TAGGGGTCTT TCCGGTCTCG CCAAA6GAAT GCAAGGTCTG TTGAATGTCG TGAAGGAA6C 4680 

ATGGCGAGAA AGGGGAGAGG GGTrTGCTTA GGTTGCAGAC AACTTACAGC ACnGGTTGG 4680 
AGTTCCTCTG GAAGCTTCTT GAAGACAAAC AACGTCTGTA GCGACCCTTT GCAGGCAGCG 4740
TCAAGGAGAC CTTCGAAGAA CTTCTGTTTG TTGCAGACAT CGCTGGGAAA CGTCCGTCGC 4740

GAACCCCCCA CCTGGCGACA GGTGCCTCTG CGGCCAAAAG CCACGTGTAT AAGATACACC 4800
CTTGGGGGGT GGACCGCTGT CCACGGAGAC GCCGGTTTTC GGTGCACATA TTCTATGTGG 4800

TGCAAAGGCG GCACAACCCC AGTGCCACGT TGTGAGTTGG ATAGTTGTGG AAAGAGTCAA 4860 

ACGTTTCCGC CGTGTTGGGG TCACGGTGCA ACACTCAACC TATCAACACC TTTCTCAGTT 4860 

ATGGCTCTCC TCAAGCGTAT TCAACAAGGG GCTGAAGGAT GCCCAGAAGG TACCCCATTG 4920

TACCGAGAGG AGTTCGCATA AGTTGTTCCC CGACTTCCTA CGGGTCTTCC ATGGGGTAAC 4920

TATGGGATCT GATCTGGGGC CTCGGTGCAC ATGCTTTACA TGTGTTTAGT CGAGGTTAAA 4980

ATACCCTAGA CTAGACCCCG GAGCCACGTG TACGAAATGT ACACAAATCA GCTCCAATTT 4980

AAACGTCTAG GCCCCCCGAA CCACGGGGAC GTGGTTTTCC TTTGAAAAAC ACGATGATAA 5040

TTTGCAGATC CGGGGGGCTT GGTGCCCCTG CACCAAAAGG AAACTTTTTG TGCTACTATT 5040

TACCATGATT GAACAAGATG GATTGCACGC AGGTTCTCCG GCCGCTTGGG TGGAGAGGCT 5100

ATGGTACTAA CTTGTTCTAC CTAACGTGCG TCCAAGAGGC CGGCGAACCC ACCTCTCCGA 5100

ATTCGGCTAT GACTGGGCAC AACAGACAAT CGGCTGCTCT GATGCCGCCG TGTTCCGGCT 5160

TAAGCCGATA CTGACCCGTG TTGTCTGTTA GCCGACGAGA CTACGGCGGC ACAAGGCCGA 5160

GTCAGCGCAG GGGCGCCCGG TTCTTTTTGT CAAGACCGAC CTGTCCGGTG CCCTGAATGA 5220

CAGTCGCGTC CCCGCGGGCC AAGAAAAACA GTTCTGGCTG GACAGGCCAC GGGACTTACT 5220

ACTGCAGGAC GAGGCAGCGC GGCTATCGTG GCTGGCCACG ACGGGCGTTC CTTGCGCAGC 5280

TGACGTCCTG CTCCGTCGCG CCGATAGCAC CGACCGGTGC TGCCCGCAAG GAACGCGTCG 5280

TGTGCTCGAC GTTGTCACTG AAGCGGGAAG GGACTGGCTG CTATTGGGCG AAGTGCCGGG 5340 

ACACGAGCTG CAACAGTGAC TTCGCCCTTC CCTGACCGAC GATAACCCGC TTCACGGCCC 5340 

GCAGGATCTC CTGTCATCTC ACCTTGCTCC TGCCGAGAAA GTATCCATCA TGGCTGATGC 5400

CGTCCTAGAG GACAGTAGAG TGGAACGAGG ACGGCTCTTT CATAGGTAGT ACCGACTACG 5400

AATGCGGCGG CTGCATACGC TTGATCCGGC TACCTGCCCA TTCGACCACC AAGCGAAACA 5460 

TTACGCCGCC GACGTATGCG AACTAGGCCG ATGGACGGGT AAGCTGGTGG TTCGCTTTGT 5460
TCGCATCGAG CGAGCACGTA CTCGGATGGA AGCCGGTCTT GTCGATCAGG ATGATCTGGA 5520

AGCGTAGCTC GCTCGTGCAT GAGCCTACCT TCGGCCAGAA CAGCTAGTCC TACTAGACCT 5520

CGAAGAGCAT CAGGGGCTCG CGCCAGCCGA ACTGTTCGCC AGGCTCAAGG CGCGCATGCC 5580

GCTTCTCGTA GTCCCCGAGC GCGGTCGGCT TGACAAGCGG TCCGAGTTCC GCGCGTACGG 5580

CGACGGCGAG GATCTCGTCG TGACCCATGG CGATGCCTGC TTGCCGAATA TCATGGTGGA 5640

GCTGCCGCTC CTAGAGCAGC ACTGGGTACC GCTACGGACG AACGGCTTAT AGTACCACCT 5640

AAATGGCCGC TTTTCTGGAT TCATCGACTG TGGCCGGCTG GGTGTGGCGG ACCGCTATCA 5700

TTTACCGGCG AAAAGACCTA AGTAGCTGAC ACCGGCCGAC CCACACCGCC TGGCGATAGT 5700

GGACATAGCG TTGGCTACCC GTGATATTGC TGAAGAGCTT GGCGGCGAAT GGGCTGACCG 5760

GCTGTATCGC AACCGATGGG CACTATAACG ACTTCTCGAA CCGCCGCTTA CCCGACTGGC 5760 

CTTCCTCGTG CTTTACGGTA TCGCCGCTCC CGATTCGCAG CGCATCGCCT TCTATCGCCT 5820

GAAGGAGCAC GAAATGCCAT AGCGGCGAGG GCTAAGCGTC GCGTAGCGGA AGATAGCGGA 5820 

TCTTGACGAG TTCTTCTGAG CGGGACTCTG GGGTTCGCAT CGATAAAATA AAAGATTTTA 5880

AGAACTGCTC AAGAAGACTC GCCCTGAGAC CCCAAGCGTA GCTATTTTAT TTTCTAAAAT 5880 

TTTAGTCTCC AGAAAAAGGG GGGAATGAAA GACCCCACCT GTAGGTTTGG CAAGCTAGCT 5940

AAATCAGAGG TCTTTTTCCC CCCTTACTTT CTGGGGTGGA CATCCAAACC GTTCGATCGA 5940

TAAGTAACGC CATTTTGCAA GGCATGGAAA AATACATAAC TGAGAATAGA GAAGTTCAGA 6000 

ATTCATTGCG GTAAAACGTT CCGTACCTTT TTATGTATTG ACTCTTATCT CTTCAAGTCT 6000

TCAAGGTCAG GAACAGATGG AACAGCTGAA TATGGGCCAA ACAGGATATC TGTGGTAAGC 6060

AGTTCCAGTC CTTGTCTACC TTGTCGACTT ATACCCGGTT TGTCCTATAG ACACCATTCG 6060

AGTTCCTGCC CCGGCTCAGG GCCAAGAACA GATGGAACAG CTGAATATGG GCCAAACAGG 6120 

TCAAGGACGG GGCCGAGTCC CGGTTCTTGT CTACCTTGTC GACTTATACC CGGTTTGTCC 6120 

ATATCTGTGG TAAGCAGTTC CTGCCCCGGC TCAGGGCCAA GAACAGATGG TCCCCAGATG 6180

TATAGACACC ATTCGTCAAG GACGGGGCCG AGTCCCGGTT CTTGTCTACC AGGGGTCTAC 6180

CGGTCCAGCC CTCAGCAGTT TCTAGAGAAC CATCAGATGT TTCCAGGGTG CCCCAAGGAC 6240

GCCAGGTCGG GAGTCGTCAA AGATCTCTTG GTAGTCTACA AAGGTCCCAC GGGGTTCCTG 6240 
CTGAAATGAC CCTGTGCCTT ATTTGAACTA ACCAATCAGT TCGCTTCTCG CTTCTGTTCG 6300

GACTTTACTG GGACACGGAA TAAACTTGAT TGGTTAGTCA AGCGAAGAGC GAAGACAAGC 6300

CGCGCTTCTG CTCCCCGAGC TCAATAAAAG AGCCCACAAC CCCTCACTCG GGGCGCCAGT 6360 

GCGCGAAGAC GAGGGGCTCG AGTTATTTTC TCGGGTGTTG GGGAGTGAGC CCCGCGGTCA 6360

CCTCCGATTG ACTGAGTCGC CCGGGTACCC GTGTATCCAA TAAACCCTCT TGCAGTTGCA 6420

GGAGGCTAAC TGACTCAGCG GGCCCATGGG CACATAGGTT ATTTGGGAGA ACGTCAACGT 6420

TCCGACTTGT GGTCTCGCTG TTCCTTGGGA GGGTCTCCTC TGAGTGATTG ACTACCCGTC 6480

AGGCTGAACA CCAGAGCGAC AAGGAACCCT CCCAGAGGAG ACTCACTAAC TGATGGGCAG 6480 

AGCGGGGGTC TTTCATTCAT GCAGCATGTA TCAAAATTAA TTTGGTTTTT TTTCTTAAGT 6540

TCGCCCCCAG AAAGTAAGTA CGTCGTACAT AGTTTTAATT AAACCAAAAA AAAGAATTCA 6540

ATTTACATTA AATGGCCATA GTTGCATTAA TGAATCGGCC AACGCGCGGG GAGAGGCGGT 6600

TAAATGTAAT TTACCGGTAT CAACGTAATT ACTTAGCCGG TTGCGCGCCC CTCTCCGCCA 6600 

TTGCGTATTG GCGCTCTTCC GCTTCCTCGC TCACTGACTC GCTGCGCTCG GTCGTTCGGC 6660

AACGCATAAC CGCGAGAAGG CGAAGGAGCG AGTGACTGAG CGACGCGAGC CAGCAAGCCG 6660

TGCGGCGAGC GGTATCAGCT CACTCAAAGG CGGTAATACG GTTATCCACA GAATCAGGGG 6720

ACGCCGCTCG CCATAGTCGA GTGAGTTTCC GCCATTATGC CAATAGGTGT CTTAGTCCCC 6720 

ATAACGCAGG AAAGAACATG TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG 6780 

TATTGCGTCC TTTCTTGTAC ACTCGTTTTC CGGTCGTTTT CCGGTCCTTG GCATTTTTCC 6780

CCGCGTTGCT GGCGTTTTTC CATAGGCTCC GCCCCCCTGA CGAGCATCAC AAAAATCGAC 6840 

GGCGCAACGA CCGCAAAAAG GTATCCGAGG CGGGGGGACT GCTCGTAGTG TTTTTAGCTG 6840

GCTCAAGTCA GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG TTTCCCCCTG 6900

CGAGTTCAGT CTCCACCGCT TTGGGCTGTC CTGATATTTC TATGGTCCGC AAAGGGGGAC 6900 

GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT 6960

CTTCGAGGGA GCACGCGAGA GGACAAGGCT GGGACGGCGA ATGGCCTATG GACAGGCGGA 6960 

TTCTCCCTTC GGGAAGCGTG GCGCTTTCTC ATAGCTCACG CTGTAGGTAT CTCAGTTCGG 7020 

AAGAGGGAAG CCCTTCGCAC CGCGAAAGAG TATCGAGTGC GACATCCATA GAGTCAAGCC 7020 
TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG TGCACGAACC CCCCGTTCAG CCCGACCGCT 7080

ACATCCAGCA AGCGAGGTTC GACCCGACAC ACGTGCTTGG GGGGCAAGTC GGGCTGGCGA 7080

GCGCCTTATC CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC 7140

CGCGGAATAG GCCATTGATA GCAGAACTCA GGTTGGGCCA TTCTGTGCTG AATAGCGGTG 7140

TGGCAGCAGC CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT 7200

ACCGTCGTCG GTGACCATTG TCCTAATCGT CTCGCTCCAT ACATCCGGCA CGATGTCTGA 7200 

TCTTGAAGTG GTGGCCTAAC TACGGCTACA CTAGAAGAAC AGTATTTGGT ATCTGCGCTC 7260

AGAAGTTCAC CACCGGATTG ATGCCGATGT GATCTTCTTG TCATAAACCA TAGACGCGAG 7260

TGCTGAAGCC AGTTACCTTC GGAAAAAGAG TTGGTAGCTC TTGATCCGGC AAACAAACCA 7320 

ACGACTTCGG TCAATGGAAG CCTTTTTCTC AACCATCGAG AACTAGGCCG TTTGTTTGGT 7320

CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT 7380

GGCGACCATC GCCACCAAAA AAACAAACGT TCGTCGTCTA ATGCGCGTCT TTTTTTCCTA 7380

CTCAAGAAGA TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC 7440

GAGTTCTTCT AGGAAACTAG AAAAGATGCC CCAGACTGCG AGTCACCTTG CTTTTGAGTG 7440

GTTAAGGGAT TTTGGTCATG AGATTATCAA AAAGGATCTT CACCTAGATC CTTTTGCGGC 7500

CAATTCCCTA AAACCAGTAC TCTAATAGTT TTTCCTAGAA GTGGATCTAG GAAAACGCCG 7500 

CGCAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTACCA ATGCTTAATC 7560 

GCGTTTAGTT AGATTTCATA TATACTCATT TGAACCAGAC TGTCAATGGT TACGAATTAG 7560 

AGTGAGGCAC CTATCTCAGC GATCTGTCTA TTTCGTTCAT CCATAGTTGC CTGACTCCCC 7620

TCACTCCGTG GATAGAGTCG CTAGACAGAT AAAGCAAGTA GGTATCAACG GACTGAGGGG 7620

GTCGTGTAGA TAACTACGAT ACGGGAGGGC TTACCATCTG GCCCCAGTGC TGCAATGATA 7680 

CAGCACATCT ATTGATGCTA TGCCCTCCCG AATGGTAGAC CGGGGTCACG ACGTTACTAT 7680

CCGCGAGACC CACGCTCACC GGCTCCAGAT TTATCAGCAA TAAACCAGCC AGCCGGAAGG 7740 

GGCGCTCTGG GTGCGAGTGG CCGAGGTCTA AATAGTCGTT ATTTGGTCGG TCGGCCTTCC 7740

GCCGAGCGCA GAAGTGGTCC TGCAACTTTA TCCGCCTCCA TCCAGTCTAT TAATTGTTGC 7800

CGGCTCGCGT CTTCACCAGG ACGTTGAAAT AGGCGGAGGT AGGTCAGATA ATTAACAACG 7800
CGGGAAGCTA GAGTAAGTAG TTCGCCAGTT AATAGTTTGC GCAACGTTGT TGCCATTGCT 7860

GCCCTTCGAT CTCATTCATC AAGCGGTCAA TTATCAAACG CGTTGCAACA ACGGTAACGA 7860

ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT CATTCAGCTC CGGTTCCCAA 7920

TGTCCGTAGC ACCACAGTGC GAGCAGCAAA CCATACCGAA GTAAGTCGAG GCCAAGGGTT 7920

CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA AAGCGGTTAG CTCCTTCGGT 7980 

GCTAGTTCCG CTCAATGTAC TAGGGGGTAC AACACGTTTT TTCGCCAATC GAGGAAGCCA 7980

CCTCCGATCG TTGTCAGAAG TAAGTTGGCC GCAGTGTTAT CACTCATGGT TATGGCAGCA 8040

GGAGGCTAGC AACAGTCTTC ATTCAACCGG CGTCACAATA GTGAGTACCA ATACCGTCGT 8040

CTGCATAATT CTCTTACTGT CATGCCATCC GTAAGATGCT TTTCTGTGAC TGGTGAGTAC 8100

GACGTATTAA GAGAATGACA GTACGGTAGG CATTCTACGA AAAGACACTG ACCACTCATG 8100

TCAACCAAGT CATTCTGAGA ATAGTGTATG CGGCGACCGA GTTGCTCTTG CCCGGCGTCA 8160

AGTTGGTTCA GTAAGACTCT TATCACATAC GCCGCTGGCT CAACGAGAAC GGGCCGCAGT 8160

ATACGGGATA ATACCGCGCC ACATAGCAGA ACTTTAAAAG TGCTCATCAT TGGAAAACGT 8220 

TATGCCCTAT TATGGCGCGG TGTATCGTCT TGAAATTTTC ACGAGTAGTA ACCTTTTGCA 8220

TCTTCGGGGC GAAAACTCTC AAGGATCTTA CCGCTGTTGA GATCCAGTTC GATGTAACCC 8280

AGAAGCCCCG CTTTTGAGAG TTCCTAGAAT GGCGACAACT CTAGGTCAAG CTACATTGGG 8280

ACTCGTGCAC CCAACTGATC TTCAGCATCT TTTACTTTCA CCAGCGTTTC TGGGTGAGCA 8340

TGAGCACGTG GGTTGACTAG AAGTCGTAGA AAATGAAAGT GGTCGCAAAG ACCCACTCGT 8340

AAAACAGGAA GGCAAAATGC CGCAAAAAAG GGAATAAGGG CGACACGGAA ATGTTGAATA 8400

TTTTGTCCTT CCGTTTTACG GCGTTTTTTC CCTTATTCCC GCTGTGCCTT TACAACTTAT 8400

CTCATACTCT TCCTTTTTCA ATATTATTGA AGCATTTATC AGGGTTATTG TCTCATGAGC 8460

GAGTATGAGA AGGAAAAAGT TATAATAACT TCGTAAATAG TCCCAATAAC AGAGTACTCG 8460 

GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG CACATTTC 8518

CCTATGTATA AACTTACATA AATCTTTTTA TTTGTTTATC CCCAAGGCGC GTGTAAAG 8518
CTGCAGCCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG CCCCGGCTCA 60

GACGTCGGAC TTATACCCGG TTTGTCCTAT AGACACCATT CGTCAAGGAC GGGGCCGAGT 60

GGGCCAAGAA CAGATGGAAC AGCTGAATAT GGGCCAAACA GGATATCTGT GGTAAGCAGT 120

CCCGGTTCTT GTCTACCTTG TCGACTTATA CCCGGTTTGT CCTATAGACA CCATTCGTCA 120 

TCCTGCCCCG GCTCAGGGCC AAGAACAGAT GGTCCCCAGA TGCGGTCCAG CCCTCAGCAG 180

AGGACGGGGC CGAGTCCCGG TTCTTGTCTA CCAGGGGTCT ACGCCAGGTC GGGAGTCGTC 180

TTTCTAGAGA ACCATCAGAT GTTTCCAGGG TGCCCCAAGG ACCTGAAATG ACCCTGTGCC 240

AAAGATCTCT TGGTAGTCTA CAAAGGTCCC ACGGGGTTCC TGGACTTTAC TGGGACACGG 240

TTATTTGAAC TAACCAATCA GTTCGCTTCT CGCTTCTGTT CGCGCGCTTC TGCTCCCCGA 300 

AATAAACTTG ATTGGTTAGT CAAGCGAAGA GCGAAGACAA GCGCGCGAAG ACGAGGGGCT 300

GCTCAATAAA AGAGCCCACA ACCCCTCACT CGGGGCGCCA GTCCTCCGAT TGACTGAGTC 360

CGAGTTATTT TCTCGGGTGT TGGGGAGTGA GCCCCGCGGT CAGGAGGCTA ACTGACTCAG 360

GCCCGGGTAC CCGTGTATCC AATAAACCCT CTTGCAGTTG CATCCGACTT GTGGTCTCGC 420 

CGGGCCCATG GGCACATAGG TTATTTGGGA GAACGTCAAC GTAGGCTGAA CACCAGAGCG 420

TGTTCCTTGG GAGGGTCTCC TCTGAGTGAT TGACTACCCG TCAGCGGGGG TCTTTCATTT 480

ACAAGGAACC CTCCCAGAGG AGACTCACTA ACTGATGGGC AGTCGCCCCC AGAAAGTAAA 480 

GGGGGCTCGT CCGGGATCGG GAGACCCCTG CCCAGGGACC ACCGACCCAC CACCGGGAGG 540

CCCCCGAGCA GGCCCTAGCC CTCTGGGGAC GGGTCCCTGG TGGCTGGGTG GTGGCCCTCC 540 

CAAGCTGGCC AGCAACTTAT CTGTGTCTGT CCGATTGTCT AGTGTCTATG ACTGATTTTA 600

GTTCGACCGG TCGTTGAATA GACACAGACA GGCTAACAGA TCACAGATAC TGACTAAAAT 600 

TGCGCCTGCG TCGGTACTAG TTAGCTAACT AGCTCTGTAT CTGGCGGACC CGTGGTGGAA 660

ACGCGGACGC AGCCATGATC AATCGATTGA TCGAGACATA GACCGCCTGG GCACCACCTT 660 

CTGACGAGTT CTGAACACCC GGCCGCAACC CTGGGAGACG TCCCAGGGAC TTTGGGGGCC 720

GACTGCTCAA GACTTGTGGG CCGGCGTTGG GACCCTCTGC AGGGTCCCTG AAACCCCCGG 720

GTTTTTGTGG CCCGACCTGA GGAAGGGAGT CGATGTGGAA TCCGACCCCG TCAGGATATG 780

CAAAAACACC GGGCTGGACT CCTTCCCTCA GCTACACCTT AGGCTGGGGC AGTCCTATAC 780
TGGTTCTGGT AGGAGACGAG AACCTAAAAC AGTTCCCGCC TCCGTCTGAA TTTTTGCTTT 840

ACCAAGACCA TCCTCTGCTC TTGGATTTTG TCAAGGGCGG AGGCAGACTT AAAAACGAAA 840 

CGGTTTGGAA CCGAAGCCGC GCGTCTTGTC TGCTGCAGCA TCGTTCTGTG TTGTCTCTGT 900

GCCAAACCTT GGCTTCGGCG CGCAGAACAG ACGACGTCGT AGCAAGACAC AACAGAGACA 900

CTGACTGTGT TTCTGTATTT GTCTGAAAAT TAGGGCCAGA CTGTTACCAC TCCCTTAAGT 960

GACTGACACA AAGACATAAA CAGACTTTTA ATCCCGGTCT GACAATGGTG AGGGAATTCA 960

TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC ACAACCAGTC GGTAGATGTC 1020

AACTGGAATC CATTGACCTT TCTACAGCTC GCCGAGCGAG TGTTGGTCAG CCATCTACAG 1020

AAGAAGAGAC GTTGGGTTAC CTTCTGCTCT GCAGAATGGC CAACCTTTAA CGTCGGATGG 1080 

TTCTTCTCTG CAACCCAATG GAAGACGAGA CGTCTTACCG GTTGGAAATT GCAGCCTACC 1080

CCGCGAGACG GCACCTTTAA CCGAGACCTC ATCACCCAGG TTAAGATCAA GGTCTTTTCA 1140

GGCGCTCTGC CGTGGAAATT GGCTCTGGAG TAGTGGGTCC AATTCTAGTT CCAGAAAAGT 1140

CCTGGCCCGC ATGGACACCC AGACCAGGTC CCCTACATCG TGACCTGGGA AGCCTTGGCT 1200

GGACCGGGCG TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA 1200

TTTGACCCCC CTCCCTGGGT CAAGCCCTTT GTACACCCTA AGCCTCCGCC TCCTCTTCCT 1260

AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG AGGAGAAGGA 1260 

CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA CCCCGCCTCG ATCCTCCCTT 1320

GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT GGGGCGGAGC TAGGAGGGAA 1320 

TATCCAGCCC TCACTCCTTC TCTAGGCGCC GGCCGCTCTA GCCCATTAAT ACGACTCACT 1380 

ATAGGTCGGG AGTGAGGAAG AGATCCGCGG CCGGCGAGAT CGGGTAATTA TGCTGAGTGA 1380 

ATAGGGCGAT TCGAATCAGG CCTTGGCGCG CCGGATCCTT AATTAAGCGC AATTGGGAGG 1440

TATCCCGCTA AGCTTAGTCC GGAACCGCGC GGCCTAGGAA TTAATTCGCG TTAACCCTCC 1440

TGGCGGTAGC CTCGAGATGG GCGTGATTAC GGATTCACTG GCCGTCGTTT TACAACGTCG 1500

ACCGCCATCG GAGCTCTACC CGCACTAATG CCTAAGTGAC CGGCAGCAAA ATGTTGCAGC 1500

TGACTGGGAA AACCCTGGCG TTACCCAACT TAATCGCCTT GCAGCACATC CCCCTTTCGC 1560 

ACTGACCCTT TTGGGACCGC AATGGGTTGA ATTAGCGGAA CGTCGTGTAG GGGGAAAGCG 1560
CAGCTGGCGT AATAGCGAAG AGGCCCGCAC CGATCGCCCT TCCCAACAGT TACGCAGCCT 1620

GTCGACCGCA TTATCGCTTC TCCGGGCGTG GCTAGCGGGA AGGGTTGTCA ATGCGTCGGA 1620

GAATGGCGAA TGGCGCTTTG CCTGGTTTCC GGCACCAGAA GCGGTGCCGG AAAGCTGGCT 1680

CTTACCGCTT ACCGCGAAAC GGACCAAAGG CCGTGGTCTT CGCCACGGCC TTTCGACCGA 1680

GGAGTGCGAT CTTCCTGAGG CCGATACTGT CGTCGTCCCC TCAAACTGGC AGATGCACGG 1740 

CCTCACGCTA GAAGGACTCC GGCTATGACA GCAGCAGGGG AGTTTGACCG TCTACGTGCC 1740

TTACGATGCG CCCATCTACA CCAACGTGAC CTATCCCATT ACGGTCAATC CGCCGTTTGT 1800

AATGCTACGC GGGTAGATGT GGTTGCACTG GATAGGGTAA TGCCAGTTAG GCGGCAAACA 1800

TCCCACGGAG AATCCGACGG GTTGTTACTC GCTCACATTT AATGTTGATG AAAGCTGGCT 1860

AGGGTGCCTC TTAGGCTGCC CAACAATGAG CGAGTGTAAA TTACAACTAC TTTCGACCGA 1860 

ACAGGAAGGC CAGACGCGAA TTATTTTTGA TGGCGTTAAC TCGGCGTTTC ATCTGTGGTG 1920

TGTCCTTCCG GTCTGCGCTT AATAAAAACT ACCGCAATTG AGCCGCAAAG TAGACACCAC 1920

CAACGGGCGC TGGGTCGGTT ACGGCCAGGA CAGTCGTTTG CCGTCTGAAT TTGACCTGAG 1980 

GTTGCCCGCG ACCCAGCCAA TGCCGGTCCT GTCAGCAAAC GGCAGACTTA AACTGGACTC 1980 

CGCATTTTTA CGCGCCGGAG AAAACCGCCT CGCGGTGATG GTGCTGCGCT GGAGTGACGG 2040

GCGTAAAAAT GCGCGGCCTC TTTTGGCGGA GCGCCACTAC CACGACGCGA CCTCACTGCC 2040 

CAGTTATCTG GAAGATCAGG ATATGTGGCG GATGAGCGGC ATTTTCCGTG ACGTCTCGTT 2100

GTCAATAGAC CTTCTAGTCC TATACACCGC CTACTCGCCG TAAAAGGCAC TGCAGAGCAA 2100 

GCTGCATAAA CCGACTACAC AAATCAGCGA TTTCCATGTT GCCACTCGCT TTAATGATGA 2160

CGACGTATTT GGCTGATGTG TTTAGTCGCT AAAGGTACAA CGGTGAGCGA AATTACTACT 2160

TTTCAGCCGC GCTGTACTGG AGGCTGAAGT TCAGATGTGC GGCGAGTTGC GTGACTACCT 2220 

AAAGTCGGCG CGACATGACC TCCGACTTCA AGTCTACACG CCGCTCAACG CACTGATGGA 2220 

ACGGGTAACA GTTTCTTTAT GGCAGGGTGA AACGCAGGTC GCCAGCGGCA CCGCGCCTTT 2280

TGCCCATTGT CAAAGAAATA CCGTCCCACT TTGCGTCCAG CGGTCGCCGT GGCGCGGAAA 2280

CGGCGGTGAA ATTATCGATG AGCGTGGTGG TTATGCCGAT CGCGTCACAC TACGTCTGAA 2340

GCCGCCACTT TAATAGCTAC TCGCACCACC AATACGGCTA GCGCAGTGTG ATGCAGACTT 2340
CGTCGAAAAC CCGAAACTGT GGAGCGCCGA AATCCCGAAT CTCTATCGTG CGGTGGTTGA 2400

GCAGCTTTTG GGCTTTGACA CCTCGCGGCT TTAGGGCTTA GAGATAGCAC GCCACCAACT 2400

ACTGCACACC GCCGACGGCA CGCTGATTGA AGCAGAAGCC TGCGATGTCG GTTTCCGCGA 2460

TGACGTGTGG CGGCTGCCGT GCGACTAACT TCGTCTTCGG ACGCTACAGC CAAAGGCGCT 2460

GGTGCGGATT GAAAATGGTC TGCTGCTGCT GAACGGCAAG CCGTTGCTGA TTCGAGGCGT 2520

CCACGCCTAA CTTTTACCAG ACGACGACGA CTTGCCGTTC GGCAACGACT AAGCTCCGCA 2520

TAACCGTCAC GAGCATCATC CTCTGCATGG TCAGGTCATG GATGAGCAGA CGATGGTGCA 2580

ATTGGCAGTG CTCGTAGTAG GAGACGTACC AGTCCAGTAC CTACTCGTCT GCTACCACGT 2580

GGATATCCTG CTGATGAAGC AGAACAACTT TAACGCCGTG CGCTGTTCGC ATTATCCGAA 2640 

CCTATAGGAC GACTACTTCG TCTTGTTGAA ATTGCGGCAC GCGACAAGCG TAATAGGCTT 2640

CCATCCGCTG TGGTACACGC TGTGCGACCG CTACGGCCTG TATGTGGTGG ATGAAGCCAA 2700

GGTAGGCGAC ACCATGTGCG ACACGCTGGC GATGCCGGAC ATACACCACC TACTTCGGTT 2700

TATTGAAACC CACGGCATGG TGCCAATGAA TCGTCTGACC GATGATCCGC GCTGGCTACC 2760

ATAACTTTGG GTGCCGTACC ACGGTTACTT AGCAGACTGG CTACTAGGCG CGACCGATGG 2760

GGCGATGAGC GAACGCGTAA CGCGAATGGT GCAGCGCGAT CGTAATCACC CGAGTGTGAT 2820 

CCGCTACTCG CTTGCGCATT GCGCTTACCA CGTCGCGCTA GCATTAGTGG GCTCACACTA 2820

CATCTGGTCG CTGGGGAATG AATCAGGCCA CGGCGCTAAT CACGACGCGC TGTATCGCTG 2880

GTAGACCAGC GACCCCTTAC TTAGTCCGGT GCCGCGATTA GTGCTGCGCG ACATAGCGAC 2880

GATCAAATCT GTCGATCCTT CCCGCCCGGT GCAGTATGAA GGCGGCGGAG CCGACACCAC 2940

CTAGTTTAGA CAGCTAGGAA GGGCGGGCCA CGTCATACTT CCGCCGCCTC GGCTGTGGTG 2940

GGCCACCGAT ATTATTTGCC CGATGTACGC GCGCGTGGAT GAAGACCAGC CCTTCCCGGC 3000 

CCGGTGGCTA TAATAAACGG GCTACATGCG CGCGCACCTA CTTCTGGTCG GGAAGGGCCG 3000

TGTGCCGAAA TGGTCCATCA AAAAATGGCT TTCGCTACCT GGAGAGACGC GCCCGCTGAT 3060

ACACGGCTTT ACCAGGTAGT TTTTTACCGA AAGCGATGGA CCTCTCTGCG CGGGCGACTA 3060

CCTTTGCGAA TACGCCCACG CGATGGGTAA CAGTCTTGGC GGTTTCGCTA AATACTGGCA 3120 

GGAAACGCTT ATGCGGGTGC GCTACCCATT GTCAGAACCG CCAAAGCGAT TTATGACCGT 3120
GGCGTTTCGT CAGTATCCCC GTTTACAGGG CGGCTTCGTC TGGGACTGGG TGGATCAGTC 3180
CCGCAAAGCA GTCATAGGGG CAAATGTCCC GCCGAAGCAG ACCCTGACCC ACCTAGTCAG 3180

GCTGATTAAA TATGATGAAA ACGGCAACCC GTGGTCGGCT TACGGCGGTG ATTTTGGCGA 3240
CGACTAATTT ATACTACTTT TGCCGTTGGG CACCAGCCGA ATGCCGCCAC TAAAACCGCT 3240

TACGCCGAAC GATCGCCAGT TCTGTATGAA CGGTCTGGTC TTTGCCGACC GCACGCCGCA 3300
ATGCGGCTTG CTAGCGGTCA AGACATACTT GCCAGACCAG AAACGGCTGG CGTGCGGCGT 3300

TCCAGCGCTG ACGGAAGCAA AACACCAGCA GCAGTTTTTC CAGTTCCGTT TATCCGGGCA 3360

AGGTCGCGAC TGCCTTCGTT TTGTGGTCGT CGTCAAAAAG GTCAAGGCAA ATAGGCCCGT 3360

AACCATCGAA GTGACCAGCG AATACCTGTT CCGTCATAGC GATAACGAGC TCCTGCACTG 3420 

TTGGTAGCTT CACTGGTCGC TTATGGACAA GGCAGTATCG CTATTGCTCG AGGACGTGAC 3420

GATGGTGGCG CTGGATGGTA AGCCGCTGGC AAGCGGTGAA GTGCCTCTGG ATGTCGCTCC 3480

CTACCACCGC GACCTACCAT TCGGCGACCG TTCGCCACTT CACGGAGACC TACAGCGAGG 3480 

ACAAGGTAAA CAGTTGATTG AACTGCCTGA ACTACCGCAG CCGGAGAGCG CCGGGCAACT 3540

TGTTCCATTT GTCAACTAAC TTGACGGACT TGATGGCGTC GGCCTCTCGC GGCCCGTTGA 3540 

CTGGCTCACA GTACGCGTAG TGCAACCGAA CGCGACCGCA TGGTCAGAAG CCGGGCACAT 3600

GACCGAGTGT CATGCGCATC ACGTTGGCTT GCGCTGGCGT ACCAGTCTTC GGCCCGTGTA 3600

CAGCGCCTGG CAGCAGTGGC GTCTGGCGGA AAACCTCAGT GTGACGCTCC CCGCCGCGTC 3660

GTCGCGGACC GTCGTCACCG CAGACCGCCT TTTGGAGTCA CACTGCGAGG GGCGGCGCAG 3660

CCACGCCATC CCGCATCTGA CCACCAGCGA AATGGATTTT TGCATCGAGC TGGGTAATAA 3720

GGTGCGGTAG GGCGTAGACT GGTGGTCGCT TTACCTAAAA ACGTAGCTCG ACCCATTATT 3720

GCGTTGGCAA TTTAACCGCC AGTCAGGCTT TCTTTCACAG ATGTGGATTG GCGATAAAAA 3780

CGCAACCGTT AAATTGGCGG TCAGTCCGAA AGAAAGTGTC TACACCTAAC CGCTATTTTT 3780

ACAACTGCTG ACGCCGCTGC GCGATCAGTT CACCCGTGTC GATAGATCTG AACAGAAACT 3840

TGTTGACGAC TGCGGCGACG CGCTAGTCAA GTGGGCACAG CTATCTAGAC TTGTCTTTGA 3840 

CATTTCCGAA GAAGACCTAG TCGACCATCA TCATCATCAT CACCGGTAAT AATAGGTAGA 3900 

GTAAAGGCTT CTTCTGGATC AGCTGGTAGT AGTAGTAGTA GTGGCCATTA TTATCCATCT 3900 
TAAGTGACTG ATTAGATGCA TTTCGACTAG ATCCCTCGAC CAATTCCGGT TATTTTCCAC 3960

ATTCACTGAC TAATCTACGT AAAGCTGATC TAGGGAGCTG GTTAAGGCCA ATAAAAGGTG 3960

CATATTGCCG TCTTTTGGCA ATGTGAGGGC CCGGAAACCT GGCCCTGTCT TCTTGACGAG 4020

GTATAACGGC AGAAAACCGT TACACTCCCG GGCCTTTGGA CCGGGACAGA AGAACTGCTC 4020

CATTCCTAGG GGTCTTTCCC CTCTCGCCAA AGGAATGCAA GGTCTGTTGA ATGTCGTGAA 4080 

GTAAGGATCC CCAGAAAGGG GAGAGCGGTT TCCTTACGTT CCAGACAACT TACAGCACTT 4080

GGAAGCAGTT CCTCTGGAAG CTTCTTGAAG ACAAACAACG TCTGTAGCGA CCCTTTGCAG 4140

CCTTCGTCAA GGAGACCTTC GAAGAACTTC TGTTTGTTGC AGACATCGCT GGGAAACGTC 4140

GCAGCGGAAC CCCCCACCTG GCGACAGGTG CCTCTGCGGC CAAAAGCCAC GTGTATAAGA 4200

CGTCGCCTTG GGGGGTGGAC CGCTGTCCAC GGAGACGCCG GTTTTCGGTG CACATATTCT 4200

TACACCTGCA AAGGCGGCAC AACCCCAGTG CCACGTTGTG AGTTGGATAG TTGTGGAAAG 4260

ATGTGGACGT TTCCGCCGTG TTGGGGTCAC GGTGCAACAC TCAACCTATC AACACCTTTC 4260 

AGTCAAATGG CTCTCCTCAA GCGTATTCAA CAAGGGGCTG AAGGATGCCC AGAAGGTACC 4320 

TCAGTTTACC GAGAGGAGTT CGCATAAGTT GTTCCCCGAC TTCCTACGGG TCTTCCATGG 4320

CCATTGTATG GGATCTGATC TGGGGCCTCG GTGCACATGC TTTACATGTG TTTAGTCGAG 4380

GGTAACATAC CCTAGACTAG ACCCCGGAGC CACGTGTACG AAATGTACAC AAATCAGCTC 4380 

GTTAAAAAAC GTCTAGGCCC CCCGAACCAC GGGGACGTGG TTTTCCTTTG AAAAACACGA 4440

CAATTTTTTG CAGATCCGGG GGGCTTGGTG CCCCTGCACC AAAAGGAAAC TTTTTGTGCT 4440

TGATAATACC ATGAAAAAGC CTGAACTCAC CGCGACGTCT GTCGAGAAGT TTCTGATCGA 4500 

ACTATTATGG TACTTTTTCG GACTTGAGTG GCGCTGCAGA CAGCTCTTCA AAGACTAGCT 4500 

AAAGTTCGAC AGCGTCTCCG ACCTGATGCA GCTCTCGGAG GGCGAAGAAT CTCGTGCTTT 4560 

TTTCAAGCTG TCGCAGAGGC TGGACTACGT CGAGAGCCTC CCGCTTCTTA GAGCACGAAA 4560 

CAGCTTCGAT GTAGGAGGGC GTGGATATGT CCTGCGGGTA AATAGCTGCG CCGATGGTTT 4620

GTCGAAGCTA CATCCTCCCG CACCTATACA GGACGCCCAT TTATCGACGC GGCTACCAAA 4620

CTACAAAGAT CGTTATGTTT ATCGGCACTT TGCATCGGCC GCGCTCCCGA TTCCGGAAGT 4680

GATGTTTCTA GCAATACAAA TAGCCGTGAA ACGTAGCCGG CGCGAGGGCT AAGGCCTTCA 4680
GCTTGACATT GGGGAATTTA GCGAGAGCCT GACCTATTGC ATCTCCCGCC GTGCACAGGG 4740

CGAACTGTAA CCCCTTAAAT CGCTCTCGGA CTGGATAACG TAGAGGGCGG CACGTGTCCC 4740

TGTCACGTTG CAAGACCTGC CTGAAACCGA ACTGCCCGCT GTTCTGCAGC CGGTCGCGGA 4800

ACAGTGCAAC GTTCTGGACG GACTTTGGCT TGACGGGCGA CAAGACGTCG GCCAGCGCCT 4800

GGCCATGGAT GCGATCGCTG CGGCCGATCT TAGCCAGACG AGCGGGTTCG GCCCATTCGG 4860

CCGGTACCTA CGCTAGCGAC GCCGGCTAGA ATCGGTCTGC TCGCCCAAGC CGGGTAAGCC 4860 

ACCGCAAGGA ATCGGTCAAT ACACTACATG GCGTGATTTC ATATGCGCGA TTGCTGATCC 4920

TCGCGTTCCT TAGCCAGTTA TGTGATGTAC CGCACTAAAG TATACGCGCT AACGACTAGG 4920 

CCATGTGTAT CACTGGCAAA CTGTGATGGA CGACACCGTC AGTGCGTCCG TCGCGCAGGC 4980 

GGTACACATA GTGACCGTTT GACACTACCT GCTGTGGCAG TCACGCAGGC AGCGCGTCCG 4980

TCTCGATGAG CTGATGCTTT GGGCCGAGGA CTGCCCCGAA GTCCGGCACC TCGTGCACGC 5040

AGAGCTACTC GACTACGAAA CCCGGCTCCT GACGGGGCTT CAGGCCGTGG AGCACGTGCG 5040 

GGATTTCGGC TCCAACAATG TCCTGACGGA CAATGGCCGC ATAACAGCGG TCATTGACTG 5100 

CCTAAAGCCG AGGTTGTTAC AGGACTGCCT GTTACCGGCG TATTGTCGCC AGTAACTGAC 5100 

GAGCGAGGCG ATGTTCGGGG ATTCCCAATA CGAGGTCGCC AACATCTTCT TCTGGAGGCC 5160 

CTCGCTCCGC TACAAGCCCC TAAGGGTTAT GCTCCAGCGG TTGTAGAAGA AGACCTCCGG 5160

GTGGTTGGCT TGTATGGAGC AGCAGACGCG CTACTTCGAG CGGAGGCATC CGGAGCTTGC 5220

CACCAACCGA ACATACCTCG TCGTCTGCGC GATGAAGCTC GCCTCCGTAG GCCTCGAACG 5220 

AGGATCGCCG CGGCTCCGGG CGTATATGCT CCGCATTGGT CTTGACCAAC TCTATCAGAG 5280

TCCTAGCGGC GCCGAGGCCC GCATATACGA GGCGTAACCA GAACTGGTTG AGATAGTCTC 5280

CTTGGTTGAC GGCAATTTCG ATGATGCAGC TTGGGCGCAG GGTCGATGCG ACGCAATCGT 5340

GAACCAACTG CCGTTAAAGC TACTACGTCG AACCCGCGTC CCAGCTACGC TGCGTTAGCA 5340

CCGATCCGGA GCCGGGACTG TCGGGCGTAC ACAAATCGCC CGCAGAAGCG CGGCCGTCTG 5400 

GGCTAGGCCT CGGCCCTGAC AGCCCGCATG TGTTTAGCGG GCGTCTTCGC GCCGGCAGAC 5400

GACCGATGGC TGTGTAGAAG TACTCGCCGA TAGTGGAAAC CGACGCCCCA GCACTCGTCC 5460 

CTGGCTACCG ACACATCTTC ATGAGCGGCT ATCACCTTTG GCTGCGGGGT CGTGAGCAGG 5460 
GAGGGCAAAG GAATAGAGTA GATGCCGACC GGGATCTATC GATAAAATAA AAGATTTTAT 5520

CTCCCGTTTC CTTATCTCAT CTACGGCTGG CCCTAGATAG CTATTTTATT TTCTAAAATA 5520

TTAGTCTCCA GAAAAAGGGG GGAATGAAAG ACCCCACCTG TAGGTTTGGC AAGCTAGCTT 5580

AATCAGAGGT CTTTTTCCCC CCTTACTTTC TGGGGTGGAC ATCCAAACCG TTCGATCGAA 5580 

AAGTAACGCC ATTTTGCAAG GCATGGAAAA ATACATAACT GAGAATAGAG AAGTTCAGAT 5640

TTCATTGCGG TAAAACGTTC CGTACCTTTT TATGTATTGA CTCTTATCTC TTCAAGTCTA 5640

CAAGGTCAGG AACAGATGGA ACAGCTGAAT ATGGGCCAAA CAGGATATCT GTGGTAAGCA 5700

GTTCCAGTCC TTGTCTACCT TGTCGACTTA TACCCGGTTT GTCCTATAGA CACCATTCGT 5700

GTTCCTGCCC CGGCTCAGGG CCAAGAACAG ATGGAACAGC TGAATATGGG CCAAACAGGA 5760 

CAAGGACGGG GCCGAGTCCC GGTTCTTGTC TACCTTGTCG ACTTATACCC GGTTTGTCCT 5760

TATCTGTGGT AAGCAGTTCC TGCCCCGGCT CAGGGCCAAG AACAGATGGT CCCCAGATGC 5820

ATAGACACCA TTCGTCAAGG ACGGGGCCGA GTCCCGGTTC TTGTCTACCA GGGGTCTACG 5820

GGTCCAGCCC TCAGCAGTTT CTAGAGAACC ATCAGATGTT TCCAGGGTGC CCCAAGGACC 5880

CCAGGTCGGG AGTCGTCAAA GATCTCTTGG TAGTCTACAA AGGTCCCACG GGGTTCCTGG 5880

TGAAATGACC CTGTGCCTTA TTTGAACTAA CCAATCAGTT CGCTTCTCGC TTCTGTTCGC 5940

ACTTTACTGG GACACGGAAT AAACTTGATT GGTTAGTCAA GCGAAGAGCG AAGACAAGCG 5940

GCGCTTCTGC TCCCCGAGCT CAATAAAAGA GCCCACAACC CCTCACTCGG GGCGCCAGTC 6000

CGCGAAGACG AGGGGCTCGA GTTATTTTCT CGGGTGTTGG GGAGTGAGCC CCGCGGTCAG 6000

CTCCGATTGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAACCCTCTT GCAGTTGCAT 6060

GAGGCTAACT GACTCAGCGG GCCCATGGGC ACATAGGTTA TTTGGGAGAA CGTCAACGTA 6060

CCGACTTGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCT GAGTGATTGA CTACCCGTCA 6120 

GGCTGAACAC CAGAGCGACA AGGAACCCTC CCAGAGGAGA CTCACTAACT GATGGGCAGT 6120

GCGGGGGTCT TTCATTCATG CAGCATGTAT CAAAATTAAT TTGGTTTTTT TTCTTAAGTA 6180

CGCCCCCAGA AAGTAAGTAC GTCGTACATA GTTTTAATTA AACCAAAAAA AAGAATTCAT 6180

TTTACATTAA ATGGCCATAG TTGCATTAAT GAATCGGCCA ACGCGCGGGG AGAGGCGGTT 6240

AAATGTAATT TACCGGTATC AACGTAATTA CTTAGCCGGT TGCGCGCCCC TCTCCGCCAA 6240
TGCGTATTGG CGCTCTTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG TCGTTCGGCT 6300

ACGCATAACC GCGAGAAGGC GAAGGAGCGA GTGACTGAGC GACGCGAGCC AGCAAGCCGA 6300

GCGGCGAGCG GTATCAGCTC ACTCAAAGGC GGTAATACGG TTATCCACAG AATCAGGGGA 6360 

CGCCGCTCGC CATAGTCGAG TGAGTTTCCG CCATTATGCC AATAGGTGTC TTAGTCCCCT 6360

TAACGCAGGA AAGAACATGT GAGCAAAAGG CCAGCAAAAG GCCAGGAACC GTAAAAAGGC 6420

ATTGCGTCCT TTCTTGTACA CTCGTTTTCC GGTCGTTTTC CGGTCCTTGG CATTTTTCCG 6420

CGCGTTGCTG GCGTTTTTCC ATAGGCTCCG CCCCCCTGAC GAGCATCACA AAAATCGACG 6480

GCGCAACGAC CGCAAAAAGG TATCCGAGGC GGGGGGACTG CTCGTAGTGT TTTTAGCTGC 6480

CTCAAGTCAG AGGTGGCGAA ACCCGACAGG ACTATAAAGA TACCAGGCGT TTCCCCCTGG 6540

GAGTTCAGTC TCCACCGCTT TGGGCTGTCC TGATATTTCT ATGGTCCGCA AAGGGGGACC 6540

AAGCTCCCTC GTGCGCTCTC CTGTTCCGAC CCTGCCGCTT ACCGGATACC TGTCCGCCTT 6600

TTCGAGGGAG CACGCGAGAG GACAAGGCTG GGACGGCGAA TGGCCTATGG ACAGGCGGAA 6600

TCTCCCTTCG GGAAGCGTGG CGCTTTCTCA TAGCTCACGC TGTAGGTATC TCAGTTCGGT 6660

AGAGGGAAGC CCTTCGCACC GCGAAAGAGT ATCGAGTGCG ACATCCATAG AGTCAAGCCA 6660 

GTAGGTCGTT CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG 6720

CATCCAGCAA GCGAGGTTCG ACCCGACACA CGTGCTTGGG GGGCAAGTCG GGCTGGCGAC 6720

CGCCTTATCC GGTAACTATC GTCTTGAGTC CAACCCGGTA AGACACGACT TATCGCCACT 6780 

GCGGAATAGG CCATTGATAG CAGAACTCAG GTTGGGCCAT TCTGTGCTGA ATAGCGGTGA 6780

GGCAGCAGCC ACTGGTAACA GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT 6840

CCGTCGTCGG TGACCATTGT CCTAATCGTC TCGCTCCATA CATCCGCCAC GATGTCTCAA 6840

CTTGAAGTGG TGGCCTAACT ACGGCTACAC TAGAAGAACA GTATTTGGTA TCTGCGCTCT 6900

GAACTTCACC ACCGGATTGA TGCCGATGTG ATCTTCTTGT CATAAACCAT AGACGCGAGA 6900

GCTGAAGCCA GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC 6960

CGACTTCGGT CAATGGAAGC CTTTTTCTCA ACCATCGAGA ACTAGGCCGT TTGTTTGGTG 6960

CGCTGGTAGC GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC 7020

GCGACCATCG CCACCAAAAA AACAAACGTT CGTCGTCTAA TGCGCGTCTT TTTTTCCTAG 7020
TCAAGAAGAT CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTGGAACG AAAACTCACG 7080

AGTTCTTCTA GGAAACTAGA AAAGATGCCC CAGACTGCGA GTCACCTTGC TTTTGAGTGC 7080

TTAAGGGATT TTGGTCATGA GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA 7140

AATTCCCTAA AACCAGTACT CTAATAGTTT TTCCTAGAAG TGGATCTAGG AAAATTTAAT 7140

AAAATGAAGT TTGCGGCCGC AAATCAATCT AAAGTATATA TGAGTAAACT TGGTCTGACA 7200

TTTTACTTCA AACGCCGGCG TTTAGTTAGA TTTCATATAT ACTCATTTGA ACCAGACTGT 7200

GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT CGTTCATCCA 7260

CAATGGTTAC GAATTAGTCA CTCCGTCGAT AGAGTCGCTA GACAGATAAA GCAAGTAGGT 7260 

TAGTTGCCTG ACTCCCCGTC GTGTAGATAA CTACGATACG GGAGGGCTTA CCATCTGGCC 7320 

ATCAACGGAC TGAGGGGCAG CACATCTATT GATGCTATGC CCTCCCGAAT GGTAGACCGG 7320

CCAGTGCTGC AATGATACCG CGAGACCCAC GCTCACCGGC TCCAGATTTA TCAGCAATAA 7380

GGTCACGACG TTACTATGGC GCTCTGGGTG CGAGTGGCCG AGGTCTAAAT AGTCGTTATT 7380

ACCAGCCAGC CGGAAGGGCC GAGCGCAGAA GTGGTCCTGC AACTTTATCC GCCTCCATCC 7440

TGGTCGGTCG GCCTTCCCGG CTCGCGTCTT CACCAGGACG TTGAAATAGG CGGAGGTAGG 7440

AGTCTATTAA TTGTTGCCGG GAAGCTAGAG TAAGTAGTTC GCCAGTTAAT AGTTTGCGCA 7500

TCAGATAATT AACAACGGCC CTTCGATCTC ATTCATCAAG CGGTCAATTA TCAAACGCGT 7500 

ACGTTGTTGC CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT ATGGCTTCAT 7560

TGCAACAACG GTAACGATGT CCGTAGCACC ACAGTGCGAG CAGCAAACCA TACCGAAGTA 7560 

TCAGCTCCGG TTCCCAACGA TCAAGGCGAG TTACATGATC CCCCATGTTG TGCAAAAAAG 7620

AGTCGAGGCC AAGGGTTGCT AGTTCCGCTC AATGTACTAG GGGGTACAAC ACGTTTTTTC 7620

CGGTTAGCTC CTTCGGTCCT CCGATCGTTG TCAGAAGTAA GTTGGCCGCA GTGTTATCAC 7680

GCCAATCGAG GAAGCCAGGA GGCTAGCAAC AGTCTTCATT CAACCGGCGT CACAATAGTG 7680

TCATGGTTAT GGCAGCACTG CATAATTCTC TTACTGTCAT GCCATCCGTA AGATGCTTTT 7740

AGTACCAATA CCGTCGTGAC GTATTAAGAG AATGACAGTA CGGTAGGCAT TCTACGAAAA 7740 

CTGTGACTGG TGAGTACTCA ACCAAGTCAT TCTGAGAATA GTGTATGCGG CGACCGAGTT 7800 

GACACTGACC ACTCATGAGT TGGTTCAGTA AGACTCTTAT CACATACGCC GCTGGCTCAA 7800
GCTCTTGCCC GGCGTCAATA CGGGATAATA CCGCGCCACA TAGCAGAACT TTAAAAGTGC 7860

CGAGAACGGG CCGCAGTTAT GCCCTATTAT GGCGCGGTGT ATCGTCTTGA AATTTTCACG 7860

TCATCATTGG AAAACGTTCT TCGGGGCGAA AACTCTCAAG GATCTTACCG CTGTTGAGAT 7920

AGTAGTAACC TTTTGCAAGA AGCCCCGCTT TTGAGAGTTC CTAGAATGGC GACAACTCTA 7920

CCAGTTCGAT GTAACCCACT CGTGCACCCA ACTGATCTTC AGCATCTTTT ACTTTCACCA 7980

GGTCAAGCTA CATTGGGTGA GCACGTGGGT TGACTAGAAG TCGTAGAAAA TGAAAGTGGT 7980

GCGTTTCTGG GTGAGCAAAA ACAGGAAGGC AAAATGCCGC AAAAAAGGGA ATAAGGGCGA 8040

CGCAAAGACC CACTCGTTTT TGTCCTTCCG TTTTACGGCG TTTTTTCCCT TATTCCCGCT 8040

CACGGAAATG TTGAATACTC ATACTCTTCC TTTTTCAATA TTATTGAAGC ATTTATCAGG 8100

GTGCCTTTAC AACTTATGAG TATGAGAAGG AAAAAGTTAT AATAACTTCG TAAATAGTCC 8100 

GTTATTGTCT CATGAGCGGA TACATATTTG AATGTATTTA GAAAAATAAA CAAATAGGGG 8160

CAATAACAGA GTACTCGCCT ATGTATAAAC TTACATAAAT CTTTTTATTT GTTTATCCCC 8160

TTCCGCGCAC ATTTC 8175 

AAGGCGCGTG TAAAG 8175 
CTGCAGCCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG CCCCGGCTCA 60

GACGTCGGAC TTATACCCGG TTTGTCCTAT AGACACCATT CGTCAAGGAC GGGGCCGAGT 60

GGGCCAAGAA CAGATGGAAC AGCTGAATAT GGGCCAAACA GGATATCTGT GGTAAGCAGT 120

CCCGGTTCTT GTCTACCTTG TCGACTTATA CCCGGTTTGT CCTATAGACA CCATTCGTCA 120

TCCTGCCCCG GCTCAGGGCC AAGAACAGAT GGTCCCCAGA TGCGGTCCAG CCCTCAGCAG 180 

AGGACGGGGC CGAGTCCCGG TTCTTGTCTA CCAGGGGTCT ACGCCAGGTC GGGAGTCGTC 180

TTTCTAGAGA ACCATCAGAT GTTTCCAGGG TGCCCCAAGG ACCTGAAATG ACCCTGTGCC 240

AAAGATCTCT TGGTAGTCTA CAAAGGTCCC ACGGGGTTCC TGGACTTTAC TGGGACACGG 240

TTATTTGAAC TAACCAATCA GTTCGCTTCT CGCTTCTGTT CGCGCGCTTC TGCTCCCCGA 300

AATAAACTTG ATTGGTTAGT CAAGCGAAGA GCGAAGACAA GCGCGCGAAG ACGAGGGGCT 300

GCTCAATAAA AGAGCCCACA ACCCCTCACT CGGGGCGCCA GTCCTCCGAT TGACTGACTC 360

CGAGTTATTT TCTCGGGTGT TGGGGAGTGA GCCCCGCGGT CAGGAGGCTA ACTGACTCAG 360 

GCCCGGGTAC CCGTCTATCC AATAAACCCT CTTGCAGTTG CATCCGACTT GTGGTCTCGC 420 

CGGGCCCATG GGCACATAGG TTATTTGGGA GAACGTCAAC GTAGGCTGAA CACCAGAGCG 420 

TGTTCCTTGG GAGGGTCTCC TCTGAGTGAT TGACTACCCG TCAGCGGGGG TCTTTCATTT 480

ACAAGGAACC CTCCCAGAGG AGACTCACTA ACTGATGGGC AGTCGCCCCC AGAAAGTAAA 480

GGGGGCTCGT CCGGGATCGG GAGACCCCTG CCCAGGGACC ACCGACCCAC CACCGGGAGG 540

CCCCCGAGCA GGCCCTAGCC CTCTGGGGAC GGGTCCCTGG TGGCTGGGTG GTGGCCCTCC 540

CAAGCTGGCC AGCAACTTAT CTGTGTCTGT CCGATTGTCT AGTGTCTATG ACTGATTTTA 600

GTTCGACCGG TCGTTGAATA GACACAGACA GGCTAACAGA TCACAGATAC TGACTAAAAT 600

TGCGCCTGCG TCGGTACTAG TTAGCTAACT AGCTCTGTAT CTGGCGGACC CGTGGTGGAA 660 

ACGCGGACGC AGCCATGATC AATCGATTGA TCGAGACATA GACCGCCTGG GCACCACCTT 660 

CTGACGAGTT CTGAACACCC GGCCGCAACC CTGGGAGACG TCCCAGGGAC TTTGGGGGCC 720 

GACTGCTCAA GACTTGTGGG CCGGCGTTGG GACCCTCTGC AGGGTCCCTG AAACCCCCGG 720

GTTTTTGTGG CCCGACCTGA GGAAGGGAGT CGATGTGGAA TCCGACCCCG TCAGGATATG 780 

CAAAAACACC GGGCTGGACT CCTTCCCTCA GCTACACCTT AGGCTGGGGC AGTCCTATAC 780
TGGTTCTGGT AGGAGACGAG AACCTAAAAC AGTTCCCGCC TCCGTCTGAA TTTTTGCTTT 840

ACCAAGACCA TCCTCTGCTC TTGGATTTTG TCAAGGGCGG AGGCAGACTT AAAAACGAAA 840

CGGTTTGGAA CCGAAGCCGC GCGTCTTGTC TGCTGCAGCA TCGTTCTGTG TTGTCTCTGT 900

GCCAAACCTT GGCTTCGGCG CGCAGAACAG ACGACGTCGT AGCAAGACAC AACAGAGACA 900

CTGACTGTGT TTCTGTATTT GTCTGAAAAT TAGGGCCAGA CTGTTACCAC TCCCTTAAGT 960

GACTGACACA AAGACATAAA CAGACTTTTA ATCCCGGTCT GACAATGGTG AGGGAATTCA 960

TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC ACAACCAGTC GGTAGATGTC 1020

AACTGGAATC CATTGACCTT TCTACAGCTC GCCGAGCGAG TGTTGGTCAG CCATCTACAG 1020 

AAGAAGAGAC GTTGGGTTAC CTTCTGCTCT GCAGAATGGC CAACCTTTAA CGTCGGATGG 1080

TTCTTCTCTG CAACCCAATG GAAGACGAGA CGTCTTACCG GTTGGAAATT GCAGCCTACC 1080

CCGCGAGACG GCACCTTTAA CCGAGACCTC ATCACCCAGG TTAAGATCAA GGTCTTTTCA 1140

GGCGCTCTGC CGTGGAAATT GGCTCTGGAG TAGTGGGTCC AATTCTAGTT CCAGAAAAGT 1140

CCTGGCCCGC ATGGACACCC AGACCAGGTC CCCTACATCG TGACCTGGGA AGCCTTGGCT 1200

GGACCGGGCG TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA 1200

TTTGACCCCC CTCCCTGGGT CAAGCCCTTT GTACACCCTA AGCCTCCGCC TCCTCTTCCT 1260

AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG AGGAGAAGGA 1260

CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA CCCCGCCTCG ATCCTCCCTT 1320

GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT GGGGCGGAGC TAGGAGGGAA 1320 

TATCCAGCCC TCACTCCTTC TCTAGGCGCC GGCCGCTCTA GCCCATTAAT ACGACTCACT 1380 

ATAGGTCGGG AGTGAGGAAG AGATCCGCGG CCGGCGAGAT CGGGTAATTA TGCTGAGTGA 1380

ATAGGGCGAT TCGAACACCA TGCACCATCA TCATCATCAC GTCGACGAAC AGAAACTCAT 1440 

TATCCCGCTA AGCTTGTGGT ACGTGGTAGT AGTAGTAGTG CAGCTGCTTG TCTTTGAGTA 1440 

TTCCGAAGAA GACCTACTCG AGATGGGCGT GATTACGGAT TCACTGGCCG TCGTTTTACA 1500 

AAGGCTTCTT CTGGATGAGC TCTACCCGCA CTAATGCCTA AGTGACCGGC AGCAAAATGT 1500

ACGTCGTGAC TGGGAAAACC CTGGCGTTAC CCAACTTAAT CGCCTTGCAG CACATCCCCC 1560

TGCAGCACTG ACCCTTTTGG GACCGCAATG GGTTGAATTA GCGGAACGTC GTGTAGGGGG 1560
TTTCGCCAGC TGGCGTAATA GCGAAGAGGC CCGCACCGAT CGCCCTTCCC AACAGTTACG 1620

AAAGCGGTCG ACCGCATTAT CGCTTCTCCG GGCGTGGCTA GCGGGAAGGG TTGTCAATGC 1620

CAGCCTGAAT GGCGAATGGC GCTTTGCCTG GTTTCCGGCA CCAGAAGCGG TGCCGGAAAG 1680

GTCGGACTTA CCGCTTACCG CGAAACGGAC CAAAGGCCGT GGTCTTCGCC ACGGCCTTTC 1680

CTGGCTGGAG TGCGATCTTC CTGAGGCCGA TACTGTCGTC GTCCCCTCAA ACTGGCAGAT 1740

GACCGACCTC ACGCTAGAAG GACTCCGGCT ATGACAGCAG CAGGGGAGTT TGACCGTCTA 1740

GCACGGTTAC GATGCGCCCA TCTACACCAA CGTGACCTAT CCCATTACGG TCAATCCGCC 1800 

CGTGCCAATG CTACGCGGGT AGATGTGGTT GGACTGGATA GGGTAATGCC AGTTAGGCGG 1800

GTTTGTTCCC ACGGAGAATC CGACGGGTTG TTACTCGCTC ACATTTAATG TTGATGAAAG 1860

CAAACAAGGG TGCCTCTTAG GCTGCCCAAC AATGAGCGAG TGTAAATTAC AACTACTTTC 1860

CTGGCTACAG GAAGGCCAGA CGCGAATTAT TTTTGATGGC GTTAACTCGG CGTTTCATCT 1920

GACCGATGTC CTTCCGGTCT GCGCTTAATA AAAACTACCG CAATTGAGCC GCAAAGTAGA 1920

GTGGTGCAAC GGGCGCTGGG TCGGTTACGG CCAGGACAGT CGTTTGCCGT CTGAATTTGA 1980

CACCACGTTG CCCGCGACCC AGCCAATGCC GGTCCTGTCA GCAAACGGCA GACTTAAACT 1980

CCTGAGCGCA TTTTTACGCG CCGGAGAAAA CCGCCTCGCG GTGATGGTGC TGCGCTGGAG 2040

GGACTCGCGT AAAAATGCGC GGCCTCTTTT GGCGGAGCGC CACTACCACG ACGCGACCTC 2040 

TGACGGCAGT TATCTGGAAG ATCAGGATAT GTGGCGGATG AGCGGCATTT TCCGTGACGT 2100

ACTGCCGTCA ATAGACCTTC TAGTCCTATA CACCGCCTAC TCGCCGTAAA AGGCACTGCA 2100 

CTCGTTGCTG CATAAACCGA CTACACAAAT CAGCGATTTC CATGTTGCCA CTCGCTTTAA 2160

GAGCAACGAC GTATTTGGCT GATGTGTTTA GTCGCTAAAG GTACAACGGT GAGCGAAATT 2160

TGATGATTTC AGCCGCGCTG TACTGGAGGC TGAAGTTCAG ATGTGCGGCG AGTTGCGTGA 2220 

ACTACTAAAG TCGGCGCGAC ATGACCTCCG ACTTCAAGTC TACACGCCGC TCAACGCACT 2220

CTACCTACGG GTAACAGTTT CTTTATGGCA GGGTGAAACG CAGGTCGCCA GCGGCACCGC 2280

GATGGATGCC CATTGTCAAA GAAATACCGT CCCACTTTGC GTCCAGCGGT CGCCGTGGCG 2280 

GCCTTTCGGC GGTGAAATTA TCGATGAGCG TGGTGGTTAT GCCGATCGCG TCACACTACG 2340 

CGGAAAGCCG CCACTTTAAT AGCTACTCGC ACCACCAATA CGGCTAGCGC AGTGTGATGC 2340 
TCTGAACGTC GAAAACCCGA AACTGTGGAG CGCCGAAATC CCGAATCTCT ATCGTGCGGT 2400

AGACTTGCAG CTTTTGGGCT TTGACACCTC GCGGCTTTAG GGCTTAGAGA TAGCACGCCA 2400

GGTTGAACTG CACACCGCCG ACGGCACGCT GATTGAAGCA GAAGCCTGCG ATGTCGGTTT 2460

CCAACTTGAC GTGTGGCGGC TGCCGTGCGA CTAACTTCGT CTTCGGACGC TACAGCCAAA 2460

CCGCGAGGTG CGGATTGAAA ATGGTCTGCT GCTGCTGAAC GGCAAGCCGT TGCTGATTCG 2520

GGCGCTCCAC GCCTAACTTT TACCAGACGA CGACGACTTG CCGTTCGGCA ACGACTAAGC 2520

AGGCGTTAAC CGTCACGAGC ATCATCCTCT GCATGGTCAG GTCATGGATG AGCAGACGAT 2580

TCCGCAATTG GCAGTGCTCG TAGTAGGAGA CGTACCAGTC CAGTAGCTAC TCGTCTGCTA 2580 

GGTGCAGGAT ATCCTGCTGA TGAAGCAGAA CAACTTTAAC GCCGTGCGCT GTTCGCATTA 2640

CCACGTCCTA TAGGACGACT ACTTCGTCTT GTTGAAATTG CGGCACGCGA CAAGCGTAAT 2640

TCCGAACCAT CCGCTGTGGT ACACGCTGTG CGACCGCTAC GGCCTGTATG TGGTGGATGA 2700

AGGCTTGGTA GGCGACACCA TGTGCGACAC GCTGGCGATG CCGGACATAC ACCACCTACT 2700

AGCCAATATT GAAAGGGACG GCATGGTGCC AATGAATCGT CTGACCGATG ATGCGCGCTG 2760 

TCG6TTATAA CTTTGGGTGG CGTAGCACGG TTACTTAGCA GAaGGCTAC TAGGCGOAG 2760 

GCTACGGGGG ATGAGCGAAG GGGTAACGGG MTGGTGGAG GGGGATGGTA ATCACGGGAG 2820 

CGATGGGGGG TACTGGGTT6 CGGATTGCGC TTACGAGGTG 6GGCTAGCAT TAGTSGGCTG 2820 

TGTGATCATC TGGTGGGTGG GGAATGAATG AGGGGAGGGG 6GTAATCAGG ACGGGGTGTA 2880 

AGAGTAGTAG AGGAGGGAGG GGTTAGTTAG TGGGGTGGGG GGATTAGTGG TGGGCGAGAT 2880 

TCGCTGGATC AAATGTGTG6 ATCCTTCCCG CCCGGTGCAG TATGAAGGGG GCGGAGGCGA 2940 

AGG6AGCTAG TTTAGAGAGG TAGGAAGGGG GGGGGAGGTG ATAGTTCCGC CGCCTGGGCT 2940 

CAGCACGGCG AGCGATATTA nTGGCCGAT 6TACGCGGGG GTGGATGAAG ACCAGGGCTT 3000 

GTGGTGGCGG TG6CTATAAT AMCGGGCTA CATGGGGGCG CAGCTACTTC TGGTCGGGAA 3000 

CCCGGC7GTG CCGAAATGGT GCATCAAAAA ATBGCnTGG GTAGCTTGGAG AGACGGGCCG 3060 

GGGGGGAGAC GGCnTAGGA GGTAtilllll TACCGAAAGG GATGGAGCTC TCTGCGCGGG 3060 

GCTGATGGTT TGGGAATACG GGGAGGGGAT 6GGTAAGAGT CTTGGGGGTT TGGCTAAATA 3120 

CGAGTAGGAA AGGCHATGG GGGTGCGGTA GGGATTGTGA 3AACGGGCAA AGGGATTTAT 3120 
CTGGCAGGCG TTTCGTCAGT ATCCCCGTTT ACAGGGCGGC TTCGTCTGGG ACTGGGTGGA 3180

GACCGTCCGC AAAGCAGTCA TAGGGGCAAA TGTCCCGCCG AAGCAGACCC TGACCCACCT 3180 

TCAGTCGCTG ATTAAATATG ATGAAAACGG CAACCCGTGG TCGGCTTACG GCGGTGATTT 3240

AGTCAGCGAC TAATTTATAC TACTTTTGCC GTTGGGCACC AGCCGAATGC CGCCACTAAA 3240

TGGCGATACG CCGAACGATC GCCAGTTCTG TATGAACGGT CTGGTCTTTG CCGACCGCAC 3300

ACCGCTATGC GGCTTGCTAG CGGTCAAGAC ATACTTGCCA GACCAGAAAC GGCTGGCGTG 3300

GCCGCATCCA GCGCTGACGG AAGCAAAACA CCAGCAGCAG TTTTTCCAGT TCCGTTTATC 3360

CGGCGTAGGT CGCGACTGCC TTCGTTTTGT GGTCGTCGTC AAAAAGGTCA AGGCAAATAG 3360

CGGGCAAACC ATCGAAGTGA CCAGCGAATA CCTGTTCCGT CATAGCGATA ACGAGCTCCT 3420

GCCCGTTTGG TAGCTTCACT GGTCGCTTAT GGACAAGGCA GTATCGCTAT TGCTCGAGGA 3420

GCACTGGATG GTGGCGCTGG ATGGTAAGCC GCTGGCAAGC GGTGAAGTGC CTCTGGATGT 3480

CGTGACCTAC CACCGCGACC TACCATTCGG CGACCGTTCG CCACTTCACG GAGACCTACA 3480

CGCTCCACAA GGTAAACAGT TGATTGAACT GCCTGAACTA CCGCAGCCGG AGAGCGCCGG 3540 

GCGAGGTGTT CCATTTGTCA ACTAACTTGA CGGACTTGAT GGCGTCGGCC TCTCGCGGCC 3540

GCAACTCTGG CTCACAGTAC GCGTAGTGCA ACCGAACGCG ACCGCATGGT CAGAAGCCGG 3600

CGTTGAGACC GAGTGTCATG CGCATCACGT TGGCTTGCGC TGGCGTACCA GTCTTCGGCC 3600

GCACATCAGC GCCTGGCAGC AGTGGCGTCT GGCGGAAAAC CTCAGTGTGA CGCTCCCCGC 3660 

CGTGTAGTCG CGGACCGTCG TCACCGCAGA CCGCCTTTTG GAGTCACACT GCGAGGGGCG 3660

CGCGTCCCAC GCCATCCCGC ATCTGACCAC CAGCGAAATG GATTTTTGCA TCGAGCTGGG 3720

GCGCAGGGTG CGGTAGGGCG TAGACTGGTG GTCGCTTTAC CTAAAAACGT AGCTCGACCC 3720

TAATAAGCGT TGGCAATTTA ACCGCCAGTC AGGCTTTCTT TCACAGATGT GGATTGGCGA 3780

ATTATTCGCA ACCGTTAAAT TGGCGGTCAG TCCGAAAGAA AGTGTCTACA CCTAACCGCT 3780 

TAAAAAACAA CTGCTGACGC CGCTGCGCGA TCAGTTCACC CGTGTCGATA GATCTGGAGG 3840

ATTTTTTGTT GACGACTGCG GCGACGCGCT AGTCAAGTGG GCACAGCTAT CTAGACCTCC 3840

TGGTGGCAGC AGGCCTTGGC GCGCCGGATC CTTAATTAAC AATTGACCGG TAATAATAGG 3900 

ACCACCGTCG TCCGGAACCG CGCGGCCTAG GAATTAATTG TTAACTGGCC ATTATTATCC 3900 
TAGATAAGTG ACTGATTAGA TGCATTTCGA CTAGATCCCT CGACCAATTC CGGTTATTTT 3960

ATCTATTCAC TGACTAATCT ACGTAAAGCT GATCTAGGGA GCTGGTTAAG GCCAATAAAA 3960

CCACCATATT GCGGTCTTTT GGCAATGTGA GGGCCCQGAA ACCTGGCCCT fflCTTCTTlBA 4020 

GGTGGTATAA CGGCAGAAAA CCGlTACACr CCCGGGCCTT TGGACCGGGA CAGAAGAAd 4020 

CGAGCATTCC TAGGGGTCTT TCCCCTCTCG CCAAAGGAAT GCAAGGTCTG 1TGAATGTCG 4080 

6CTCGTAA6G ATCCCCAGAA AGGGGAGAGC GGnTCOTA CGTTCCAGAC AACTTACAGG 4080 

TGAAGGAAGC AGTTCCTCre GAAGCTTCTT GAAGACAAAC AACGTCTGTA GCGACCCTTF 4140 

ACTTCCTTCG TCAAGGAGAC CTTCGAAGAA CTTCTCTTTG TTGCAGACAT CGCTGGGAAA 4140 

6CAGGCAGCG 6AACCCCCCA CCTGGCGACA GGTGCCTCTG CGGCCAAAAG CCACGTGTAT 4200 

CGTCCGTC6C CTTGGGGG6T GGACCGaGT CCACGGAGAC GCCGGTnTC GGTGCACATA 4200 

AAGATACACC TGCAAAGGCG GCACAACCCC AGTGCCACGT TGTGAGTTGG ATAGTrGlGG 4260 

TTCTATGTGG ACGTTTCCGC CGTGTTBGGG TCACGGTGCA ACACTCAACC TATCAACACC 4260 

AAAGAGTCAA ATGGCTCTCC TCAAGCGTAT TCAACAAGGG GCTGAAGGAT GCCCAGAAGG 4320 

nrCTCAGTT TACCGAGAGG AGTTGGCATA AGTTGTTCCC CGACTTCCTA CGGGTCnCC 4320 

TACCCCATTG TATGGGATCT GATCTGGGGC CTCGGTGCAC ATGCnTACA TGIGI I lAGT 4380 

ATGGGGTAAC ATACCCTAGA CTAGACCCC6 GA6CCACGTG TACGAAATGT ACACAAATCA 4380 

CGAGGTTAAA AAACGTCTAG GCCCCCCGAA CCAC6GGGAC GTGGTnTCC TTT6AAAAAC 4440 

GCTCCAATTT TTTGCAGATC CGGGGGGCTT GGTGCCCCTG CACCAAAAGG AAACTTTTTG 4440 

ACGAT6ATAA TACCATGAAA AAGCCTGAAC TCACCGCGAC GTCTGTCGAG AAGTTTCTGA 4500 

TGCTACTATT ATCGIACTTT TTCGGACTTG AGTGGCGCTG CAGACAGCTC TTCAAAGACT 4500 

TCGAAAAGTT CGACAGCGTC TCCGACCTGA TGCAGCTCTC GGAGGGCGAA GAATCrCGTG 4560 

AGCmrCAA GCTGTCGCAG AGGCTGGACT ACGTCGAGAG CCTCCCGGT CTTAGAGCAC 4560 

CnrCAGCTT CGATSTAGGA QBGCGTGGAT ATGTCCTGCG GGTAAATAGC TGCGCCGATG 4620 

6AAAGTCGAA GCTACATCCT CCCGCACCTA TACAGGACGC CCATTTATCG ACGCGGCTAC 4620 

GTTTCTACAA AGATCGTTAT G7TTATCG6C AC7TTGCATC GGCCGCGCTC CCGATTCCGG 4680 

CAAAGATGTT TCTAGCAATA CAAATAGCCG TGAAACGTAG CCGGCGCGAG G6CTAAG6CC 4680 
AAGTGCTTGA CATTGGGGAA TTTAGCGAGA GCCTGACCTA TTGCATCTCC CGCCGTGCAC 4740

TTCACGAACT GTAACCCCTT AAATCGCTCT CGGACTGGAT AACGTAGAGG GCGGCACGTG 4740

AGGGTGTCAC 6TTGCAAGAC CTGCCTGAAA CCGAACTSCC CGCTGTTCTG CAfiCCGGTCG 4800 
TCCCACAGTG CAACGTTCTG GACGGACTTT GGCTTGACGG 6CGACAAGAC GTCGGCCAGC 4800 

CGGAGGCCAT GGATGCGATC GCTGCGGCCG ATCTTAGCCA GACGAGC6GG TTCGGCCCAT 4860 
GCCTCC6GTA CCTAC6CTAG CGACGCCGGC TAGAATCGGT CTGCTCGCCC AAGCCGGGTA 4860 

TCGGACCGCA AGGAATCGGT CAATACACTA CATGGCGTGA TTTCATATCC GCGATTGCTG 4920: 
AGCCTCGCGT TCCTTAGCCA GTTATGTGAT 6TACCGCACT AAAGTATACG CGCTAAt^C 4920 

ATCCCCATGT GTATCACTGG CAAACTGTGA TGGACGACAC CGTCAGTGCG TCCGTCGCGC 4980 

TAGG6GTACA CATAGTGACC GTTTGACACT ACCTGaGTG 6CAGTCACGC AGGCAGCGC6 4980 

AGGCTCTCGA T6AGCTGATG CTTTGGGCCG AGGACTGCCC CGAAGTCCGG CACCTCGTGC 5040 

TCCGAGAGCT ACTCGACTAC GAAACCCGGC TCCT6ACGGG GOTCAGGCC GTGGAGCACG 5040 

ACGCGGATTT CGGCTCCAAC AATGTCCTGA CGGACAATGG CCGCATAACA GCGGJCATTG 5100 

TGCGCCTAAA GCCGAGGTTG TTACAGGACT GCCTGTTACC GGCGTATTGT CGCCAGTAAC 5100 

ACTGGAGCGA GGC6ATGTTG GGGGATTCCC AATACGAGGT CGCCAACATC TTCTTCTGGA 5160 

TGACCTC6CT CCGCTACAAG CCCCTAAGGG 7TATGCTCCA GCGGTTGTAG AAGAAGACCT 5160 

GGCCGTGGTT GGCTTGTAT6 GAGCAGCAGA CGCGCTACTT CGAGCGGAGG CATCCGGAGC 5220 

CCGGCACCAA CCGAACATAC CTCGTCGTCT GCGCGATGAA GCTCGCCTCC GTAGGCCTCG 5220 

TTGCAGGATC GCCGCGGCTC CGGGCGTATA TGCTCCGCAT TSGTCTTGAC CAACTCTATC 5280 

AACGTCCTAG CGGCGCCGAG GCCCGCATAT ACGAGGCCTA ACCAGAACTG GTTGAGATAG 5280 

AGAGCTTGGT TGACGGCAAT TTCGATGATG CAGCTTGGGC GCAGGGTCGA TGCGACGCAA 5340 

TCTCGAACCA ACTGCCGTTA AAGCTACTAC GTCGAACCC6 CGTCXCAGCT ACGCTGCGTT 5340 

TCGTCCGATC CGGAGCCGG6 ACTGTCGGGC GTACACAAAT CGCCCGCAGA AGCGCGGCCG 5400 

AGCAGGCTAG 6CCTCGGCCC TGACAGCCCG CATGTGTTTA GCGQGCGTCT TCGCGCCGGC 5400 

TCTGGACCGA TGGCTGTGTA GAAGTACTCG CC6ATAGTGG AAACCGACGC CCCAGCACTC 5460 

AGACCTGGCT ACCGACACAT CTTCATGAGC GGCTATCACC TUGGaGCG GGGTCGTGAG 5460 
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GTCCGAGGGC AAAGGAATAG AGTAGATGCC 6ACC666ATC TATCGATAAA ATAAAAGATT 5520 

CAG6CTCCCG TrTCCTTATC TCATCTACG6 CTGGCCCTA6 ATAGCTATTT TATnTCTAA 5520 

TTATTTAGTC TCCAGAAAAA GSGGGGAATG AAGAGCGCAA CCTCTAGSTT TGGCAAGCTA 5580 

AATAAATCAG AGGTCI 1 1 1 1 CCCCCC7TAC TTTCTGGGGT GGACATCCAA ACCGTTCGAT 5580 

GCTTAAGTAA CGCCAT7TTG CAAGGCATGG AAAAATACAT AACTGAGAAT AGAGAAGTTC 5640 

CGAATTCATT GCGGTAAAAC GITCCGTACC TT7TTATGTA TTGACTCTTA TCTCTTCAAfi 5640 

AGAtCAAGGT CAGGAACAGA TGGAACAGCT GAATATGGGC CAAAGAGGAT ATCTGTGGTA 5700 

TCTAGtrCCA GTCCTTGTCT ACCTTGTCGA CTTATACCCG GTTTGTCCTA TAGACACCAT 5700 

AGCAGTTCCr GCCCCGGCTC AGGGCCAAGA ACAGATGGAA CAGCTGAATA TGGGCCAAAC 5760 

TCGTCAAGGA CGGGGCCGAG TCCCGGTTa TGTCTACCTT GTCGACTTAT ACCCGGITTG 5760 

AGGATATCTG TG6TAAGCAG TTCCTGCCCC GGCTCAGGGC CAAGAACAGA TGGTCCCCAG 5820 

TCCTATAGAC ACCATTCGTC AAGGACGGGG CCGAGTCCCG GTrCTTGTCT ACCAGGGGTC 5820 

ATGCGGTCCA 6CCCTCAGCA GnTCTAGAG AACCATCAGA TGTTrCCAGG GTGCCCCAAG 5880 

TACGCCAGGT CGGGAGTCGT CAAAGATCTC nGGTAGTCT ACAAAGGTCC CACGGGGTTC 5880 

GAGCTGAAAT GACCC7CTGC CTTATTTGAA CTAACCAATC AGTTCGCFTC TCGCTTCTGT 5940 

CTGGACTTTA CTGGGACACG GAATAAACTT GATTGGTTAG TCAAGCGAAG AGCGAAGACA 5940 

TCGCGCGCTT CTGCTCCCCG AGCTCAATAA AAGAGCCCAC AACCCCTCAC TCGGGGCGCC 6000 

AGCGCGC6AA GACGAGGGGC TCGAGTTATT TrCTCGGGTG nGGGGAGTG AGCCCCGCGG 6000 

AGTCCTCCGA TTGAaGAGT CGCCCGGGTA CCCGTGTATC CAATAAACCC TCITGCAGTT 6060 

TCAGGAGGCT AAC7GACTCA GCGGGCCCAT GGGCACATAG GTTATTTGGG AGAACGTCAA 6060 

GCATCCGACT TGTGGTCTCG CTGTTCnTG GGAGGGTCTC CTaGAGTGA TTGACTACCC 6120 

CGTAGGCTGA ACACCA6AGC GACAAGGAAC CCTCCCAGAG GAGACTCACT AACTGATGGG 6120 

CTCAGCGGGG GTCTTTCATr CATGCAGCAT GTAT€AAAAT TAATTTGGTT llllliCTTA 6180 

CAGTCGCCCC CAGAAAfiTAA GTACGTCGTA CATAGTrTTA ATTAAACCAA AAAAAAGAAT 6180 

AGTATTTACA TTAAATGGCC ATAGTTGCAT TAATGAATCG GCCAACGCGC GGGGAGAG6C 6240 

TCATAAATGT AATTTACCGG TATCAACGTA ATTACTTAGC CGGTTGCGCG CCCCTCTCCG 6240 
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GGTITGCGTA TTGGCGCTCT TCC6C7TCCT CGCTCACTGA CTCGCTGC6C TCGGTCGTTC 6300 

CCAAACGCAT MCCGCGAGA AGGCGAAG6A GCGAGTGACT GAGCGACGCG AGCCAGCAAG 6300 

GGCreCGGCG AGCGGTATCA 6CTCACTCAA AGGCGGTAAT ACGGTTATCC ACAGAATCAG 6360 

CCGAGGCCGC TC6CCATAGT CGAGTSASTT TCCGCCATTA TGCCAATAGG TGTCTTAGTC 6360 

GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA AAAGGCCAGG AAGCGTAAAA 6420 

CCCTATTGCG TCLIIILIIG TACACTCGTT TTCCGGTCGT TTTCCGGTCC TTGGCATnT 6420 

AGGCCGCeiT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC 6480 

TCCGGCGCAA CGACCGCAAA AAGGTATCCG AGGCGGGGGG ACTGCTCGTA GTCTTTTTAG 6480 

GACGCTCAAG TCAGAG6TGG CGAAACCCGA CAG6ACTATA AAGATACCAG GCGTTTCCCC 6540 

CTGCGAGTTC AGTCTCCAGC GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGGAAAGGGG 6540 

GTGGAAGCrC CCTCGTGCGC TCTCCTGTrC CGACCCTGCC GCTTACCGGA TACCTGTCCG 6600 

GACCTTCGAG GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATGGCCT ATGGACAGGC 6600 

CCTTTCrCCC TTCGGGAAGC GTGGCGCTTT CTCATAGaC ACGCTGTAGG TATCTCAGTT 6660 

GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTATCGAG TGCGACATCC ATAGAGTCAA 6660 

CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGrTGGACGA ACCCCCCGTT CAGCCCGAGC 6720 

GCCACATCCA GCAAGCGAGG TTCGACCCGA CACACGTGCT TGGGGGGCAA GTCGGGCTGG 6720 

GCTGCGCCTT ATCC6GTAAC TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC 6780 

CGACGCGGAA TAGGCCATTG ATAGCAGAAC TCAGGTTGGG CCATTCTGTG CTGAATAGCG 6780 

CACTGGCAGC AGCCACTCGT AACAGGAHA GCAGAGCGAG GTATGTAGGC GGTGCTACAG 6840 

GTGACCGTC6 TCGGTGACCA TTGTCCTAAT CGTCTCGCTC CATACATCC6 CCACGATGTC 6840 

AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG AACAGTATTT GGTATCTGCG 6900 

TCAA6AACTT CACCACCGGA TTGAT6CCGA TGTGATCTTC TTGTCATAAA CCATAGACGC 6900 

CrCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTT6ATCC G6CAAACAAA 6960 

GAGACGACTT CGGTCAATGG AAGCCI 1 1 1 1 CTCAACCATC GAGAACTAGG CCGTTTGTTT 6960 

CCACCGCTGG TAGCGGTGGT IIIIIIGIil GCAAGCAGCA GA7TACGCGC A6AAAAAAAG 7020 

GGTGGCGACC ATCGCCACCA AAAAAACAAA CGHCGTCGT CTAATGCGCG T LHHIII C 7020 
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6ATCTCAAGA AGATCCTTTG ATCmTCTA CGGGGTCTGA CGCTCAGTGG MCGAAAACT 7080 
CTAGAGTTCT TCTAGGAAAC TAfiAAAAGAT GCCCCAGACT GC6AGTCACC TreCmTGA 7080 

CACGTTAAGG GATnTGSTC ATCAGATTAT CAAAAAfiGAT CTTCACCTAG ATCCTTTTGC 7140 
GTGCAATTCC CTAAAACCAG TACTCTAATA GnTTTCCTA GAAGTGGATC TAGGAAAAC6 7140 

GGCCGCAAAT CAATCTAAA6 TATATATGAG TAAACTTGGT CTGACAGTTA CCAATGCTTA 7200 
CCGGCGTTTA GTTAGATTrC ATATATACTC ATTTGAACCA GACTGTCAAT GSTTAC6AAT 7200 

ATCA6TGAGG CACCTATQC AGCGATCTGT CTAnTCGTT CATCCATAGT TGCCTGACTC 7260 
TAGTCACTGC GTGGATAGAG TCGCTAGACA GATAAAGCAA GTAGGTATCA ACGGACTCAG 7260 

CCCGTCGTGT A6ATAACTAC GATACGGGAG GGOTACCAT CTGGCCCCAG TGCTGCAATG 7320 
GGGCAGCACA TCTATTGATG CTATGCCCTC CCGAATGGTA GACCGGGGTC ACGACGTTAC 7320 

ATACCGCGAG ACCCACGCTC ACCGGCTCCA GATTTATCAG CAATAAACCA GCCAGCCGGA 7380 

TATGGCGCTC TGGGT6CGAG TGGCCGAGGT CTAAATAGTC GnATTTGGT CGGTCGGCCT 7380 

AGGGCCGAGC GCAGAAGTGG TCCTGCAACT TTATCCGCCT CCATCCAGTC TATTAATTGT 7440 

TCCCGGCTCG CGTCTTCACC AGGACGTTGA AATAGGCGGA GGTAGGTCAG ATAATTAACA 7440 

TGCCGGGAAG CTAGAGTAAG TAGTTCGCCA GTTAATAGTT TGCGCAACGT TGTTGCCATT 7500

ACGGCCCTTC GATCTCATTC ATCAAGCGGT CAATTATCAA ACGCGTTGCA ACAACGGTAA 7500

GCTACAGGCA TCGTGGTGTC ACGCTCGTCG TTTG6TATGG CTTCATTCAG CTCCGGITCC 7560 

CGATGTCCGT AGCACCACAG TGCGAGCAGC AAACCATACC GAAGTAAGTC GAGGCCAAGG 7560 

CAACGATCAA GGCGAGTTAC ATGATCCCCC A7GTTGTGCA AAAAAGCGGT TAGCTCCTTC 7620 

GITGCTAGTT CCGCTCAATG TACTAGGGGG TACAACACGT TTTnCGCCA ATCGAGGAAG 7620 

GGTCCTCCGA TCGTTGTCAG AAGTAA6TTG GCCGCAGTGT TATCACTCAT G6TTATGGCA 7680 

CCAGGAGGCT AGCAACAGTC TTCATTCAAC CGGCSTCACA ATAGTGAGTA CCAATACCGT 7680 

GCACTGCATA ATTCTCTTAC TGTCATGCCA TCCGTAAGAT GCmTCTGT GACTGGTGAG 7740 

CSreACGTAT TAAGAGAATS ACASTACGGT AGGCATTCTA CGAAAAGACA CTGACCACTC 7740 

TACTCAACCA AGTCAnCTG AGAATAGTGT ATGCGGCGAC CGAGTTGCTC TTGCCCGGC6 7800 

ATGAGnGGT TCAGTAAGAC TCTTATCACA TACGCCGaG GCTCAACGAG AACGGGCCGC 7800 
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TCAATACGGG ATAATACCGC 6CCACATAGC AGAACTTTAA AAGT6CTCAT CATTIGGAAAA 7860 

AGTTATGCCC TATTATGGCG CGGTGTATCG TCTTGAAATT TTCACGAGTA GTAACCmT 7860 

CGTTCrrCGG GGCGAAAACT CTCAASGATC TTACCGCTGT TSAGATCCAG TTCGATGTAA 7920 

GCAAGAAGCC CCGCmTGA SASTTCCTAG AATGGCGACA ACTCTAGGTC AAGGTACATT 7920 

CCCACTCGTG CACCCAACTG ATCTTCAGCA TCTTTTACTT TCACCAGCGT TTCTGGGTGA 7980 

GGGTGAGCAC GTGGGTTGAC TAGAAGTCGT AGAAAAT6AA AGTGGTCGCA AAGACCCACT 7980 

GCAAAAACAG GAAGGCAAAA TGCCGCAAAA AAGGGAATAA GGGCGACACG GAAATGTTGA 8040

CGTTTTTGTC CTTCCGTTTT ACGGCGTTTT TTCCCTTATT CCCGCTGTGC CTTTACAACT 8040

ATACTCATAC TCTTCCmT TCAATATTAT TGAAGCATTT ATCAGGGTTA TTGTCTCATG 8100 

TATGAGTATG AGAAGGAAAA AGTTATAATA ACTTCGTAAA TAGTCCCAAT AACAGAGTAC 8100 

AGCGGATACA TATTTGAATG TATTTAGAAA AATAAACAAA TAGGGGTTCC GCGCACATTT 8160 

TCGCCTATGT ATAAACTTAC ATAAATCTTT TTAIIIGIII ATCCCCAAGG CGCGTGTAAA 8160 

C 8161 

G 8161 